The present invention features isolated DNA encoding allergens of
Dermatophagoides (house dust mites), particularly of the species
Dermatophagoides farinae and Dermatophagoides pteronyssinus, which are protein
allergens or peptides which include at least one epitope of the protein
allergen. In particular, the invention provides DNA encoding the major
D. farinae allergens, Der f I and Der f II, and DNA encoding the major
D. pteronyssinus allergens, Der p I and Der p II. The present invention further
relates to proteins and peptides encoded by the isolated D. farinae and
D. pteronyssinus DNA, including proteins containing sequence polymorphisms. In
addition, the proteins or peptides encoded by the isolated DNA, their use as
diagnostic and therapeutic reagents, and methods of diagnosing and treating
sensitivity to house dust mite allergens are disclosed.

CLONING AND SEQUENCING OF ALLERGENS
OF DERMATOPHAGOIDES (HOUSE DUST MITE)

Description
Background 

Recent reports have documented the importance of responses to the Group I and
Group II allergens in house dust mite allergy. For example, it has been
documented that over 60% of patients have at least 50% of their anti-mite
antibodies directed towards these proteins (Lind, P. et al., Allergy,
22:259-274 (1984); van der Zee, J.S. et al., J. Allergy Clin. Immunol.,
81:884-896 (1988)). It is possible that children show a greater degree of
reactivity (Thompson, P.J. et al., Immunology, 64:311-314 (1988)). Allergy to
mites of the genus Dermatophagoides (D.) is associated with conditions such as
asthma, rhinitis and atopic dermatitis. Two species, D. pteronyssinus and
D. farinae, predominate and, as a result, considerable effort has been expended
in trying to identify the allergens produced by these two species.
D. pteronyssinus mites are the most common Dermatophagoides species in house
dust in Western Europe and Australia. The species D. farinae predominates in
other regions, such as North America and Japan (Wharton, G.W., J. Medical
Entom., 12:577-621 (1976)). It has long been recognized that allergy to mites
of this genus is associated with diseases such as asthma, rhinitis and atopic
dermatitis. It is still not clear what allergens produced by these mites are
responsible for the allergic response and associated conditions.

Summary of the Invention 

The present invention relates to isolated DNA which encodes a protein allergen
of Dermatophagoides ((D.) house dust mite) or a peptide which includes at least
one epitope of a protein allergen of a house dust mite of the genus
Dermatophagoides. It particularly relates to DNA encoding major allergens of
the species D. farinae, designated Der f I and Der f II, or portions of these
major allergens (i.e., peptides which include at least one epitope of Der f I
or of Der f II). It also particularly relates to DNA encoding major allergens
of D. pteronyssinus, designated Der p I and Der p II, or portions of these
major allergens (i.e., peptides which include at least one epitope of Der p I
or of Der p II).

The present invention further relates to proteins and peptides encoded by the
isolated Dermatophagoides (e.g., D. farinae, D. pteronyssinus) DNA, including
proteins containing sequence polymorphisms. Several nucleotide and resulting
amino acid sequence polymorphisms have been discovered in the Der p I, Der p II
and Der f II allergens. All such nucleotide variations and proteins, or
portions thereof, containing a sequence polymorphism are within the scope of
the invention.

Peptides of the present invention include at least one epitope of a D. farinae
allergen (e.g., at least one epitope of Der f I or Der f II) or at least one
epitope of a D. pteronyssinus allergen (e.g., at least one epitope of Der p I
or of Der p II). The invention also relates to antibodies specific for
D. farinae proteins or peptides and to antibodies specific for D. pteronyssinus
proteins or peptides.

Dermatophagoides DNA, proteins and peptides of the present invention are useful
for diagnostic and therapeutic purposes. For example, isolated D. farinae
proteins or peptides can be used to detect sensitivity in an individual to
house dust mites and can be used to treat sensitivity (reduce sensitivity or
desensitize) in an individual, to whom therapeutically effective quantities of
the D. farinae protein or peptide are administered. For example, isolated
D. farinae protein allergen, such as Der f I or Der f II, can be administered
periodically, using standard techniques, to an individual in order to
desensitize the individual. Alternatively, a peptide which includes at least
one epitope of Der f I or of Der f II can be administered for this purpose.
Isolated D. pteronyssinus protein allergen, such as Der p I or Der p II, can be
administered as described for Der f I or Der f II. Similarly, a peptide which
includes at least one Der p I epitope or at least one Der p II epitope can be
administered for this purpose. A combination of these proteins or peptides
(e.g., Der f I and Der f II; Der p I and Der p II; or a mixture of both Der f
and Der p proteins) can also be administered. The use of such isolated proteins
or peptides provides a means of desensitizing individuals to important house
dust mite allergens.

Brief Description of the Drawings

Figures 1A and 1B show the nucleotide and predicted amino acid sequence of cDNA
λgt11 p1(13T) (SEQ ID NOS: 1 and 2, respectively). Numbers to the right are
nucleotide positions whereas numbers above the sequence are amino acid
positions. Positive amino acid residue numbers correspond to the sequence of
the mature excreted Der p I beginning with threonine. Negative sequence numbers
refer to the proposed transient pre- and preproenzyme forms of Der p I. The
arrows indicate the beginning of the proposed proenzyme sequence and the mature
Der p I, respectively. Residues -15 to -13 enclosed by an open box make up the
proposed cleavage site for the proenzyme formation, and the dashed residues
52-54 represent a potential N-glycosylation site. The termination TAA codon and
the adjacent polyadenylation signal are underlined. Amino acid residues 1-41,
79-95, 111-142, and 162-179 correspond to known tryptic peptide sequences
determined by conventional amino acid sequencing analysis.

Figure 2 shows the restriction map of the cDNA insert of clone λgt11 p1(13T)
and the strategy of DNA sequencing. Arrows indicate directions in which
sequences were read.

Figure 3 is a comparison of N-terminal sequences of Der p I and Der f I. The
amino acid sequence for Der p I is equivalent to amino acids 1-20 in Figures 1A
and 1B; the Der f I sequence is from reference (12).

Figure 4 shows the reactivity of λgt11 p1(13T) with anti-Der p I. Lysates from
Y1089 lysogens induced for phage were reacted by dot-blot with rabbit
anti-Der p I (Der p I) or normal rabbit serum (Nrs). Dots (2 µl) were made in
triplicate from lysates of bacteria infected with λgt11 p1(13T) (a) or λgt11
(b). When developed with 125I-protein A and autoradiography, only the reaction
between λgt11 p1(13T) lysate and the anti-Der p I showed reactivity.

Figure 5 shows reaction of clone pGEX-p1(13T) with IgE in allergic serum.
Overnight cultures of pGEX or pGEX-p1 were diluted 1/10 in broth and grown for
2 hours at 37°C. They were induced with IPTG and grown for a further 2 hours at
37°C. The bacteria were pelleted and resuspended in PBS to 1/10 the volume of
culture media. The bacteria were lysed by freeze/thaw and sonication. A
radioimmune dot-blot was performed with 2 µl of these lysates using
mite-allergic or non-allergic serum. The dots in row 1 were from E. coli
containing pGEX and those in rows 2-4 from different cultures of E. coli
infected with pGEX-p1(13T). Reactivity to pGEX-p1(13T) was found with IgE in
allergic but not non-allergic serum. No reactivity to the vector control or
with non-allergic serum was found.

Figure 6 shows seroreactivity of cDNA clones coding for Der p II in plaque
radioimmune assay. Segments of nitrocellulose filters from plaque lifts were
taken from clones 1, 3, A, B and the vector control Amp1. These were reacted by
immunoassay for human IgE against allergic serum (AM) in row 1, non-allergic
serum (WT) in row 2 and by protein A immunoassay for Der p I with rabbit
antiserum in row 3. The clones 1, 3 and B reacted strongly with allergic serum
but not with non-allergic serum or the vector control. (Clone B and vector
control were not tested with non-allergic serum.)

Figures 7A and 7B show the nucleotide and predicted amino acid sequence of cDNA
of λgt11 pII (C1) (SEQ ID NOS: 3 and 4, respectively). Numbers to the right are
nucleotide positions and numbers above are amino acid positions. Positive
numbers for amino acids begin at the known N-terminal of Der p II and match the
known sequence of the first 40 residues. Residues -1 to -16 resemble a typical
leader sequence with a hydrophobic core.

Figure 8 shows the N-terminal amino acid homology of Der p II and Der f II
(Der f II sequence from reference 30).

Figure 9 is a restriction map of the cDNA insert of clone λgt11 f1, including a
schematic representation of the strategy of DNA sequencing. Arrows indicate
directions in which sequences were read.

Figures 10A and 10B are the nucleotide sequence and the predicted amino acid
sequence of cDNA λgt11 f1 (SEQ ID NOS: 5 and 6, respectively). Numbers above
are nucleotide positions; numbers to the left are amino acid positions.
Positive amino acid residue numbers correspond to the sequence of the mature
excreted Der f I beginning with threonine. Negative sequence numbers refer to
the signal peptide and the proenzyme regions of Der f I. The arrows indicate
the beginning of the proenzyme sequence and the mature Der f I, respectively.
The underlined residues -81 to -78 make up the proposed cleavage site for the
proenzyme formation, while the underlined residues 53-55 represent a potential
N-glycosylation site. The termination TGA codon and the adjacent
polyadenylation signal are also underlined. Amino acid residues 1-28 correspond
to a known tryptic peptide sequence determined by conventional amino acid
sequencing analysis.

Figure 11 is a composite alignment of the amino acid sequences of the mature
Der p I (SEQ ID NO: 11) and Der f I proteins. The numbering above the sequence
refers to Der p I. The asterisk denotes the gap that was introduced for maximal
alignment. The symbol (.) is used to indicate that the amino acid residue of
Der f I at that position is identical to the corresponding amino acid residue
of Der p I. The arrows indicate those residues making up the active site of
Der p I and Der f I.
Figures 12A and 12B are a comparison of the amino acid sequence in the pre- and
pro-peptide regions of Der f I with those of rat cathepsin H, rat cathepsin L,
papain, aleurain, CP1, CP2, rat cathepsin B, CTLA-2, MCP, Der p I and
actinidin. Gaps, denoted by dashes, were added for maximal alignment. Double
asterisks denote conserved amino acid residues which are shared by greater than
80% of the proenzymes; single asterisks show residues which are conserved in
greater than 55% of the sequences. The symbol (.) is used to denote
semiconserved equivalent amino acids which are shared by greater than 90% of
the proenzyme regions.

Figures 13A and 13B are a hydrophilicity plot of the Der p I mature protein and
a hydrophilicity plot of the Der f I mature protein produced using the
Hopp-Woods algorithm computed with the MacVector Sequence Analysis Software
(IBI, New Haven) using a 6-residue window. Positive values indicate relative
hydrophilicity and negative values indicate relative hydrophobicity.

Figure 14 is the nucleotide sequence and the predicted amino acid sequence of
Der f II cDNA (SEQ ID NOS: 7 and 8, respectively). Numbers to the right are
nucleotide positions and numbers above are amino acid residues. The stop (TAA)
signal is underlined. The first 8 nucleotides are from the oligonucleotide
primer used to generate the cDNA, based on the Der p II sequence.

Figure 15 is a restriction map of Der f II cDNA, which was generated by
computer from the sequence data. A map of Der p II similarly generated is shown
for comparison. There are few common restriction enzyme sites conserved. Sites
marked with an asterisk were introduced by cloning procedures.
Figures 16A, 16B, and 16C show the alignment of Der f II and Der p II cDNA
sequences. Numbers to the right are nucleotide positions and numbers above are
amino acid residues. The top line gives the Der p II nucleotide sequence and
the second the Der p II amino acid residues. The next two lines show
differences of Der f II to these sequences.

Figures 17A and 17B are hydrophilicity plots of Der f II and Der p II using the
Hopp-Woods algorithm computed with the MacVector Sequence Analysis Software
(IBI, New Haven) using a 6-residue window.

Figure 18 is a composite alignment of the amino acid sequences of five Der p I
clones (a)-(e) which illustrates polymorphism in the Der p I protein (SEQ ID
NO: 11). The numbering refers to the sequence of the Der p I(a) clone. The
symbol (-) is used to indicate that the amino acid residue of a Der p I clone
is identical to the corresponding amino acid residue of Der p I(a) at that
position. The amino acid sequences of these clones indicate that there may be
significant variation in Der p I, with five polymorphic amino acid residues
found in the five sequences.

Figure 19 is a composite alignment of the amino acid sequences of three
Der p II clones (c), (1) and (2) which illustrates polymorphism in the Der p II
protein. The numbering refers to the sequence of the Der p II(c) clone. The
symbol (.) is used to indicate that the amino acid residue of a Der p II clone
is identical to the corresponding amino acid residue of Der p II(c) at that
position.

Figure 20 is a composite alignment of the amino acid sequences of six Der f II
clones (i.e., pFL1, pFL2, MT3, MT5, MT18 and MT16) which illustrates
polymorphism in the Der f II protein (SEQ ID NO: 13). The numbering refers to
the sequence of the Der f II pFL1 clone. The symbol (.) is used to indicate
that the amino acid residue of a Der f II clone is identical to the
corresponding amino acid residue of Der f II pFL1 at that position.

Figures 21A, 21B, and 21C are the nucleotide and predicted amino acid sequences
of cDNA λgt11 p1(13T) (SEQ ID NOS: 9 and 10, respectively), including the full
length of the preproenzyme form of Der p I. Negative sequence numbers refer to
the proposed pre- and preproenzyme forms of Der p I.

Detailed Description of the Invention
The present invention relates to a nucleotide sequence coding for an allergen
from the house dust mite Dermatophagoides and to the encoded Dermatophagoides
protein or peptide which includes at least one epitope of the Dermatophagoides
allergen. It particularly relates to a nucleotide sequence capable of
expression in an appropriate host of a major allergen of D. farinae, such as
Der f I or Der f II, or of a peptide which includes at least one epitope of
Der f I or of Der f II. It also particularly relates to a nucleotide sequence
capable of expression in an appropriate host of a major allergen of
D. pteronyssinus, such as Der p I or Der p II, or of a peptide which includes
at least one epitope of Der p I or of Der p II. The Dermatophagoides nucleotide
sequence is useful as a probe for identifying additional nucleotide sequences
which hybridize to it and encode other mite allergens, particularly D. farinae
or D. pteronyssinus allergens. Further, the present invention relates to
nucleotide sequences which hybridize to a D. farinae protein-encoding
nucleotide sequence or a D. pteronyssinus protein-encoding nucleotide sequence
but which encode a protein from another species or type of house dust mite,
such as D. microceras (e.g., Der m I and Der m II).

The encoded Dermatophagoides mite allergen or peptide which includes at least
one Dermatophagoides (Der f I or Der f II; Der p I or Der p II) epitope can be
used for diagnostic purposes (e.g., as an antigen) and for therapeutic purposes
(e.g., to desensitize an individual). Alternatively, the encoded house dust
mite allergen can be a protein or peptide, such as a D. microceras protein or
peptide, which displays the antigenicity of or is cross-reactive with a Der f
or a Der p allergen; generally, these have a high degree of amino acid
homology.

Accordingly, the present invention also relates to compositions which include a
Dermatophagoides allergen (e.g., Der f I allergen, Der f II allergen; Der p I
or Der p II allergen or other Der allergen cross-reactive therewith) or a
peptide which includes at least one epitope of a Dermatophagoides allergen
(Der f I, Der f II, Der p I, Der p II or other Der allergen cross-reactive
therewith) individually or in combination, and which can be used for
therapeutic applications (e.g., desensitization). As is described below, DNA
coding for major allergens from house dust mites has been isolated and
sequenced. In particular, and as is described in greater detail in the
Examples, cDNA clones coding for the Der p I, Der p II, Der f I and Der f II
allergens have been isolated and sequenced. The nucleotide sequence of each of
these clones has been compared with that of the homologous allergen from the
related mite species (i.e., Der p I and Der f I; Der p II and Der f II), as has
the predicted amino acid sequence of each.

The following is a description of isolation and sequencing of the two cDNA
clones coding for Der f allergens and their comparison with the corresponding
D. pteronyssinus allergen, and a description of use of the nucleotide sequences
and encoded products in a diagnostic or a therapeutic context.

Isolation and Sequence Analysis of Der f I

A cDNA clone coding for Der f I, a major allergen from the house dust mite
D. farinae, has been isolated and sequenced. A restriction map of the cDNA
insert of the clone is represented in Figure 9, as is the strategy of DNA
sequencing. This Der f I cDNA clone contains a 1.1-kb cDNA insert encoding a
typical signal peptide, a proenzyme region and the mature Der f I protein. The
product is 321 amino acid residues: a putative 18-residue signal peptide, an
80-residue proenzyme (pro-peptide) region, and a 223-residue mature enzyme
region. The derived molecular weight is 25,191. The nucleotide sequence and the
predicted amino acid sequence of the Der f I cDNA are represented in Figures
10A and 10B. The deduced amino acid sequence shows significant homology to
other cysteine proteases in the pro-region, as well as in the mature protein.
Sequence alignment of the mature Der f I protein with the homologous allergen
Der p I from the related mite D. pteronyssinus (Figure 11) revealed a high
degree of homology (81%) between the two proteins, as predicted by previous
sequencing at the protein level. In particular, the residues comprising the
active site of these enzymes were conserved and a potential N-glycosylation
site was present at equivalent positions in both mite allergens.
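
A homology figure of this kind can be illustrated with a short calculation. The
following is a minimal sketch, not the method used to obtain the 81% value: it
assumes homology is counted as identical residues over aligned, non-gap
positions, with the gap written as an asterisk as in Figure 11. The function
name and the toy fragments are hypothetical and are not the allergen sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "*") -> float:
    """Percent of aligned, non-gap positions at which the two residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: gap columns are left out of the count
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


# Hypothetical toy fragments used only to exercise the function.
toy_a = "TNACSINGNA"
toy_b = "TSACRIN*NA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions (for example, counting gap columns as mismatches) give
slightly different percentages, so the convention should be stated alongside
any reported value.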

Conserved cysteine residue pairs (31, 71) and (65, 103), where the numbering
refers to Der p I, are apparently involved in disulphide bond formation on the
basis of the assumed similarity of the three-dimensional structure of Der p I
and Der f I to that of papain and actinidin, which also have an additional
disulphide bridge. The fifth and final cysteine residue for which there is a
homologous cysteine residue in papain and actinidin is the active site cysteine
(residue 35 in Der f I). It is possible that the two extra cysteine residues
present in Der p I and Der f I are involved in forming a third disulphide
bridge.

The potential N-glycosylation site in Der p I is also present at the equivalent
position in Der f I, with conservation of the crucial first and last residues
of the tripeptide site. The degree of glycosylation of Der f I and Der p I has
yet to be determined. Carbohydrates, including mannose, galactose,
N-acetylglucosamine and N-acetylgalactosamine, have been reported in purified
preparations of these mite allergens (Chapman, M.D., J. Immunol., 125:587-592
(1980); Wolden, S. et al., Int. Arch. Allergy Appl. Immunol., 68:144-151
(1982)).
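
The tripeptide site referred to above follows the general Asn-X-Ser/Thr pattern
for potential N-glycosylation, in which X is any residue other than proline.
The sketch below shows how such sites can be located in a protein sequence. The
function name and the toy sequence are hypothetical; a match marks only a
potential site and says nothing about whether the residue is glycosylated.

```python
import re

# Asn-X-Ser/Thr, where X is any residue except proline. The lookahead lets
# overlapping candidate sites be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_glyc_sites(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the Asn, tripeptide) for each potential site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence, not an allergen sequence.
toy = "MKNQSAANPTGNITQ"
for position, tripeptide in find_n_glyc_sites(toy):
    print(position, tripeptide)
```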

Given the degree of homology over the first thirty N-terminal amino acid
residues between mature Der p I and Der m I (70%) and mature Der f I and
Der m I (97%), with the Der m I residues determined by conventional amino acid
sequencing (Platts-Mills TAE et al., In: Mite Allergy, a World-Wide Problem,
27-29 (1988); Lind, P. and N. Horn, In: Mite Allergy, a World-Wide Problem,
30-34 (1988)), it is probable that the full mature Der m I sequence will
confirm an overall 70-80% homology between the Group I mite allergens. Der m I
is an allergen from D. microceras. High homology between the proenzyme moieties
of Der p I and Der f I (91%) over the residues -23 to -1 and the structural
analysis of Der f I suggest that the Group I allergens are likely to have
N-terminal extension peptides of the mature protein of homologous structure
and, at least for the pro-peptide, composition.

Studies on the fine structure of signal sequences have identified three
structurally dissimilar regions so far: a positively charged N-terminal (n)
region, a central hydrophobic (h) region and a more polar C-terminal (c) region
that seems to define the cleavage site (von Heijne, G., EMBO J., 2:2315-2323
(1984); Eur. J. Biochem., 122:17-21 (1983); J. Mol. Biol., 184:99-105 (1985)).
Analysis of the signal peptide of Der f I revealed that it, too, contained
these regions (Figures 12A and 12B). The n-region is extremely variable in
length and composition, but its net charge does not vary appreciably with the
overall length, and has a mean value of about +1.7. The n-region of the Der f I
signal peptide, with a length of two residues, has a net charge of +2
contributed by the initiator methionine (which is unformylated and hence
positively charged in eukaryotes) and the adjacent lysine (Lys) residue. The
h-region of Der f I is enriched with hydrophobic residues, the characteristic
feature of this region, with only one hydrophilic residue, serine (Ser),
present, which can be tolerated. The overall amino acid composition of the
Der f I c-region is more polar than that of the h-region, as is found in signal
sequences, with the h/c boundary located between residues -6 and -5, which is
its mean position in eukaryotes. Thus, the Der f I pre-peptide sequence appears
to fulfill the requirements to which a functional signal sequence must conform.
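
The +2 net charge of the two-residue n-region can be reproduced with simple
bookkeeping. The sketch below assumes a deliberately crude charge model: +1 for
a free (unformylated) amino terminus, +1 for each Lys or Arg, -1 for each Asp
or Glu, and histidine treated as neutral. The function name is hypothetical;
the n-region "MK" (initiator methionine followed by lysine) is taken from the
description above.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def n_region_net_charge(n_region: str, free_amino_terminus: bool = True) -> int:
    """Net charge of a signal-peptide n-region under a simple side-chain model."""
    # Assumption: an unformylated initiator Met contributes +1 via its amino group.
    charge = 1 if free_amino_terminus else 0
    for residue in n_region.upper():
        if residue in POSITIVE:
            charge += 1
        elif residue in NEGATIVE:
            charge -= 1
    return charge


print(n_region_net_charge("MK"))  # Der f I n-region as described in the text
```

Setting `free_amino_terminus=False` models a formylated initiator methionine,
for which the amino-terminal contribution is dropped.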
While the signal sequences of Der f I and other cysteine proteases share
structural homology, all being composed of the n-, h- and c-regions, they are
highly variable with respect to overall length and amino acid sequence, as is
clear in Figures 12A and 12B. However, significant sequence homology has been
shown between the pro-regions of cysteine protease precursors (Ishidoh, K.
et al., FEBS Letters, 226:33-37 (1987)). Alignment of the proenzyme regions of
Der f I and a number of other cysteine proteases (Figures 12A and 12B)
indicated that these pro-regions share a number of very conserved residues as
well as semi-conserved residues which were present in over half of the
sequences. This homology was increased if conservative amino acids such as
valine (Val), isoleucine (Ile) and leucine (Leu) (small hydrophobic residues)
or arginine (Arg) and Lys (positively charged residues) were regarded as
identical. The Der f I pro-region possessed six out of seven highly conserved
amino acids and all the residues at sites of conservative changes. The homology
at less conserved sites was lower. Homology in the pro-peptide, in particular
the highly conserved residues, may be important when considering the function
of the pro-peptide in the processing of these enzymes, since it indicates that
these sequences probably have structural and functional similarities.

Highly cross-reactive B cell epitopes on Der f I and Der p I have been
demonstrated using antibodies present in mouse, rabbit and human sera (Heymann,
P.W. et al., J. Immunol. 132:2841-2847 (1986); Platts-Mills, TAE et al.,
J. Allergy Clin. Immunol. 78:398-407 (1986)). However, species-specific
epitopes have also been defined in these systems. Murine monoclonal antibodies
bound predominantly to species-specific determinants (Platts-Mills TAE et al.,
J. Allergy Clin. Immunol. 132:1479-1484 (1987)). Some 40% of rabbit
anti-Der p I reactivity was accounted for by epitopes unique to Der p I
(Platts-Mills, TAE et al., J. Allergy Clin. Immunol. 78:398-407 (1986)), and
some species-specific binding of antibodies from allergic humans was observed,
although the majority bind to cross-reactive epitopes (Platts-Mills TAE et al.,
J. Immunol. 139:1479-1484 (1987)).

The recombinant DNA strategy of gene fragmentation and expression was used
(Greene, W.K. et al., Immunol. (1990)) to define five antigenic regions of
recombinant Der p I which contained B cell epitopes recognized by a rabbit
anti-Der p I antiserum. Using the technique of immunoabsorption, three of these
putative epitopes were shown to be shared with Der f I (located on regions
containing amino acid residues 34-47, 60-72 and 166-194) while two appeared to
be specific for Der p I (regions 82-99 and 112-140). Differences in the
reactivity of these peptides to rabbit anti-D. farinae supported the above
division into cross-reactive and species-specific epitopes. The sequence
differences shown between the Der p I and the Der f I proteins are primarily
located in the N- and C-terminal regions, as well as in an extended surface
loop (residues 85-136) linking the two domains of the enzyme that includes
helix D (residues 127-136), as predicted from the secondary and tertiary
structures of papain and actinidin (Baker, E.N. and J. Drenth, In: Biological
Macromolecules and Assemblies, Vol. 3, pp. 314-368, John Wiley and Sons, NY
(1987)). The surface location of these residues is supported by the
hydrophilicity plots of Der p I and Der f I in Figures 13A and 13B, which
illustrate the predominantly hydrophilic nature of this region that predicts
surface exposure. This region also contains the two species-specific B cell
epitopes recognized by the rabbit anti-Der p I serum (see above). Analysis of
the sequences shows that two of the regions containing cross-reactive epitopes
(residues 34-47 and 60-72) are completely conserved between Der p I and
Der f I, while the majority of residues in the third cross-reactive
epitope-containing region (residues 166-194) are conserved.
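
Hydrophilicity plots of the kind shown in Figures 13A and 13B are sliding-window
averages of per-residue hydrophilicity values. The following is a minimal
sketch of such a profile using the published Hopp-Woods scale and a 6-residue
window; it is not the MacVector implementation, and how that software assigns
each window average to a sequence position is not assumed here. The function
name and the toy peptide are hypothetical.

```python
# Hopp-Woods hydrophilicity values (positive = hydrophilic).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(protein: str, window: int = 6) -> list[float]:
    """Mean Hopp-Woods value for every full window along the sequence."""
    values = [HOPP_WOODS[residue] for residue in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical toy peptide: a charged stretch followed by a hydrophobic one.
toy = "KKDEERLLVVII"
profile = hydrophilicity_profile(toy)
print([round(v, 2) for v in profile])
```

Windows with positive averages correspond to relatively hydrophilic stretches,
which is the property used above to argue for surface exposure.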

Expression of cDNA encoding Der f I results in production of pre-pro-Der f I
protein in E. coli, a recombinant protein of greater solubility, stability and
antigenicity than that of recombinant Der p I. Protein encoded by Der f I cDNA
has been expressed using a pGEX vector and has been shown by radioimmune assay
to react with rabbit anti-D. farinae antibodies. The availability of high
yields of soluble Der f I allergen and antigenic derivatives will facilitate
the development of diagnostic and therapeutic agents and the mapping of B and T
cell antigenic determinants.

With the availability of the complete amino acid sequence of recombinant
Der f I, mapping of the epitopes recognized by both the B and T cell
compartments of the immune system can be carried out. The use of techniques
such as the screening of overlapping synthetic peptides, the use of monoclonal
antibodies and gene fragmentation and expression should enable the
identification of both the continuous and topographical epitopes of Der f I. It
will be particularly useful to determine whether allergenic (IgE-binding)
determinants have common features and are intrinsically different from
antigenic (IgG-binding) determinants and whether T cells recognize unique
epitopes different from those recognized by B cells. Studies to identify the
Der f I epitopes reactive with mite-allergic human IgE antibodies and the
division of these into determinants cross-reactive with Der p I and
determinants unique to Der f I can also be carried out. B cell (and T cell)
epitopes specific for either species can be used to provide useful diagnostic
reagents for determining reactivity to the different mite species, while
cross-reacting epitopes are candidates for a common immunotherapeutic agent.

As described in detail in the Examples, a cDNA clone coding for Der p I which
contained a 0.8-kb cDNA insert has been isolated. Sequence analysis revealed
that the 222 amino acid residue mature recombinant Der p I protein showed
significant homology with a group of cysteine proteases, including actinidin,
papain, cathepsin H and cathepsin B.

Isolation and Sequence Analysis of Der f II

A cDNA clone coding for Der f II, a major allergen from the house dust mite
D. farinae, has been isolated and sequenced, as described in the Examples. The
nucleotide sequence and the predicted amino acid sequence of the Der f II cDNA
are represented in Figure 14. A restriction map of the cDNA insert of a clone
coding for Der f II is represented in Figure 15.

Figures 16A, 16B, and 16C show the alignment of Der f II and Der p II
cDNA sequences. The homology of the sequence of Der f II with Der p II (88%) is
higher than the 81% homology found between Der p I and Der f I, a difference
which is significant (p<0.05) using the chi-squared distribution. The reason for
this may simply be that the Group I allergens are larger and each residue may be
less critical for the structure and function of the molecule. It is known, for
example, that, assuming they adopt a similar conformation to other cysteine
proteases, many of the amino acid differences in Der p I and Der f I lie in
residues linking the two domain structures of the molecules. The 6 cysteine
residues are conserved between the Group II allergens, suggesting similar
disulphide bonding, although this may be expected, given the high overall
homology. Another indication of the conservation of these proteins is that 34/55
of the nucleotide changes in the coding sequence are in the third base of a
codon, which usually does not change the amino acid. Residues that may be of
importance in the function of the molecule include Ser 57, where all three bases
are changed but the amino acid is conserved. A similar phenomenon exists at
residue 88, where a complete codon change has conserved a small aliphatic
residue. Again, like Der p II, the Der f II cDNA clone does not have a poly A
tail, although the 3' non-coding region is rich in adenosine and has two possible
polyadenylation signals ATAA. The nucleotides encoding the first four residues
are from the PCR primer which was designed from the known homology of Der p II
and Der f II from N-terminal amino acid sequencing. A primer based on the
C-terminal sequence can now be used to determine these bases, as well as the
signal sequence.
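
A tally of nucleotide changes by codon position, of the kind reported above
(34/55 changes in the third base), can be illustrated with a short sketch. It is
offered only as an illustration. The function name and the toy sequences are
hypothetical and are not the Der p II or Der f II sequences, and the sketch
assumes two coding sequences that are already aligned in frame without gaps.

```python
def changes_by_codon_position(seq_a, seq_b):
    """Count mismatches between two aligned, in-frame coding sequences,
    grouped by position within the codon (1, 2 or 3)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {1: 0, 2: 0, 3: 0}
    for index, (base_a, base_b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if base_a != base_b:
            # index 0, 1, 2 within each codon maps to codon position 1, 2, 3
            counts[index % 3 + 1] += 1
    return counts


# Toy, made-up sequences (hypothetical; not taken from the figures).
# The second codon (TCT vs AGC) differs at all three bases yet both encode Ser.
toy_a = "ATGTCTGCTAAA"
toy_b = "ATGAGCGCAAAG"
counts = changes_by_codon_position(toy_a, toy_b)
total = sum(counts.values())
print(counts, "third-base fraction:", counts[3] / total)
```

For real data the two sequences would first have to be aligned and trimmed to
the shared coding region; that step is outside the scope of this sketch.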
Uses of the subject allergenic proteins/peptides and DNA encoding same

The materials resulting from the work described herein, as well as
compositions containing these materials, can be used in methods of diagnosing,
treating and preventing allergic responses to mite allergens, particularly to mites
of the genus Dermatophagoides, such as D. farinae and D. pteronyssinus. In
addition, the cDNA (or the mRNA from which it was transcribed) can be used to
identify other similar sequences. This can be carried out, for example, under
conditions of low stringency and those sequences having sufficient homology
(generally greater than 40%) can be selected for further assessment using the
method described herein. Alternatively, high stringency conditions can be used.
In this manner, DNA of the present invention can be used to identify sequences
coding for mite allergens having amino acid sequences similar to that of Der f I,
Der f II, Der p I or Der p II. Thus, the present invention includes not only
D. farinae and D. pteronyssinus allergens, but other mite allergens as well (e.g.,
other mite allergens encoded by DNA which hybridizes to DNA of the present
invention).

Proteins or peptides encoded by the cDNA of the present invention
can be used, for example, as "purified" allergens. Such purified allergens are
useful in the standardization of allergen extracts or preparations which can be
used as reagents for the diagnosis and treatment of allergy to house dust mites.
Through use of the peptides of the present invention, allergen preparations of
consistent, well-defined composition and biological activity can be made and
administered for therapeutic purposes (e.g., to modify the allergic response of a
house dust mite-sensitive individual). Der f I or Der f II peptides or proteins (or
modified versions thereof, such as are described below) may, for example, modify
B-cell response to Der f I or Der f II, T-cell response to Der f I and Der f II, or
both responses. Similarly, Der p I or Der p II proteins or peptides may be used to
modify B-cell and/or T-cell response to Der p I or Der p II. Purified allergens can
also be used to study the mechanism of immunotherapy of allergy to house dust
mites, particularly to Der f I, Der f II, Der p I and Der p II, and to design modified
derivatives or analogues which are more useful in immunotherapy than are the
unmodified ("naturally-occurring") peptides.

In those instances in which there are epitopes which are cross-reactive,
such as the three epitopes described herein which are shared by Der f I
and Der p I, the area(s) of the molecule which contain the cross-reactive epitopes
can be used as common immunotherapeutic peptides to be administered in treating
allergy to the two (or more) mite species which share the epitope. For example,
the cross-reactive epitopes could be used to induce IgG blocking antibody against
both allergens (e.g., Der f I and Der p I allergen). A peptide containing a
univalent antibody epitope can be used, rather than the entire molecule, and may
prove advantageous because the univalent antibody epitope cannot crosslink mast
cells and cause adverse reactions during desensitizing treatments. It is also
possible to attach a B cell epitope to a carrier molecule to direct T cell control of
allergic responses.

Alternatively, it may be desirable or necessary to have peptides which
are specific to a selected Dermatophagoides allergen. As described herein, two
epitopes which are apparently Der p I-specific have been identified. A similar
approach can be used to identify other species-specific epitopes (e.g., Der p I or II,
Der f I or II). The presence in an individual of antibodies to the species-specific
epitopes can be used as a quick serological test to determine which mite species is
causing the allergic response. This would make it possible to specifically target
therapy provided to an individual to the causative species and, thus, enhance the
therapeutic effect.

Work by others has shown that high doses of allergens generally
produce the best results (i.e., best symptom relief). However, many people are
unable to tolerate large doses of allergens because of allergic reactions to the
allergens. Naturally-occurring allergens can be modified in such a manner that
modified peptides or modified allergens are produced which have the same or
enhanced therapeutic properties as the corresponding naturally-occurring allergen
but have reduced side effects (especially anaphylactic reactions).
These can be, for example, a peptide of the present invention (e.g., one having all
or a portion of the amino acid sequence of Der f I or Der f II, Der p I or Der p II).
Alternatively, a combination of peptides can be administered. A modified peptide
or peptide analogue (e.g., a peptide in which the amino acid sequence has been
altered to modify immunogenicity and/or reduce allergenicity or to which a
component has been added for the same purpose) can be used for desensitization
therapy.

Administration of the peptides of the present invention to an
individual to be desensitized can be carried out using known techniques. A
peptide or combination of different peptides can be administered to an individual
in a composition which includes, for example, an appropriate buffer, a carrier
and/or an adjuvant. Such compositions will generally be administered by
injection, inhalation, transdermal application or rectal administration. Using the
information now available, it is possible to design a Der p I, Der p II, Der f I or
Der f II peptide which, when administered to a sensitive individual in sufficient
quantities, will modify the individual's allergic response to Der p I, Der p II,
Der f I and/or Der f II. This can be done, for example, by examining the
structures of these allergens, producing peptides to be examined for their ability
to influence B-cell and/or T-cell responses in house dust mite-sensitive
individuals and selecting appropriate epitopes recognized by the cells. Synthetic
amino acid sequences which mimic those of the epitopes and which are capable of
down regulating allergic response to Der p I, Der p II, Der f I or Der f II
allergens can be made. Proteins, peptides or antibodies of the present invention
can also be used, in known methods, for detecting and diagnosing allergic response
to Der f I or Der f II. For example, this can be done by combining blood obtained
from an individual to be assessed for sensitivity to one of these allergens with
an isolated allergenic peptide of house dust mite, under conditions appropriate
for binding of or stimulating components (e.g., antibodies, T cells, B cells) in
the blood with the peptide and determining the extent to which such binding
occurs. Der f and Der p proteins or peptides can be administered together to treat
an individual sensitive to both allergen types.

It is now also possible to design an agent or a drug capable of
blocking or inhibiting the ability of Der p I, Der p II, Der f I or Der f II to induce
an allergic reaction in house dust mite-sensitive individuals. Such agents could be
designed, for example, in such a manner that they would bind to relevant
anti-Der p I, anti-Der p II, anti-Der f I or anti-Der f II IgEs, thus preventing
IgE-allergen binding and subsequent mast cell degranulation. Alternatively, such
agents could bind to cellular components of the immune system, resulting in
suppression or desensitization of the allergic response to these allergens. A
non-restrictive example of this is the use of appropriate B- and T-cell epitope
peptides, or modifications thereof, based on the cDNA/protein structures of the
present invention to suppress the allergic response to these allergens. This can
be carried out by defining the structures of B- and T-cell epitope peptides which
affect B- and T-cell function in in vitro studies with blood cells from house dust
mite-sensitive individuals.

The cDNA encoding Der p I, Der p II, Der f I or Der f II or a peptide
including at least one epitope thereof can be used to produce additional peptides,
using known techniques such as gene cloning. A method of producing a protein
or a peptide of the present invention can include, for example, culturing a host cell
containing an expression vector which, in turn, contains DNA encoding all or a
portion of a selected allergenic protein or peptide (e.g., Der p I, Der p II, Der f I,
Der f II or a peptide including at least one epitope). Cells are cultured under
conditions appropriate for expression of the DNA insert (production of the
encoded protein or peptide). The expressed product is then recovered, using
known techniques. Alternatively, the allergen or portion thereof can be
synthesized using known mechanical or chemical techniques. As used herein, the
term protein or peptide refers to proteins or peptides made by any of these
techniques. The resulting peptide can, in turn, be used as described previously.

DNA to be used in any embodiment of this invention can be cDNA
obtained as described herein or, alternatively, can be any oligodeoxynucleotide
sequence having all or a portion of the sequence represented in Figures 1A and 1B,
7A and 7B, 10A and 10B, and 14 or their functional equivalent. Such
oligodeoxynucleotide sequences can be produced chemically or mechanically,
using known techniques. A functional equivalent of an oligonucleotide sequence
is one which is capable of hybridizing to a complementary oligonucleotide
sequence to which the sequence (or corresponding sequence portions) of Figures
1A and 1B, 7A and 7B, 10A and 10B, and 14 hybridizes and/or which encodes a
product (e.g., a polypeptide or peptide) having the same functional characteristics
as the product encoded by the sequence (or corresponding sequence portion)
represented in these figures. Whether a functional equivalent must meet one or
both criteria will depend on its use (e.g., if it is to be used only as an oligoprobe, it
need meet only the first criterion and if it is to be used to produce house dust mite
allergen, it need only meet the second criterion).

The structural information now available (e.g., DNA, protein/peptide
sequences) can also be used to identify or define T cell epitope peptides and/or B
cell epitope peptides which are of importance in allergic reactions to house dust
mite allergens and to elucidate the mediators or mechanisms (e.g., interleukin-2,
interleukin-4, gamma interferon) by which these reactions occur. This knowledge
should make it possible to design peptide-based house dust mite therapeutic agents
or drugs which can be used to modulate these responses.

The present invention will now be further illustrated by the following
Examples, which are not intended to be limiting in any way.

EXAMPLE 1

MATERIALS AND METHODS

Cloning and Expression of Der p I cDNA.

Polyadenylated mRNA was isolated from the mite
Dermatophagoides pteronyssinus cultured by Commonwealth Serum Laboratories,
Parkville, Australia, and cDNA was synthesized by the RNase H method (5)
using a kit (Amersham International, Bucks). After the addition of EcoRI linkers
the cDNA was ligated into λgt11 and plated in E. coli Y1090 (r-) (Promega Biotec,
Madison, Wisconsin), to produce a library of 5x10^ recombinants. Screening was
performed by plaque radioimmune assay (6) using a rabbit anti-Der p I antiserum
(7). Reactivity was detected by hydrochloride in 0.1 
M sodium acetate buffer pH 5.2 were then added and the mixture was
homogenized and spun at 10,000 rpm for 30 min in a Sorvall SS34 rotor. The
supernatant was collected and layered onto a CsCl pad (5 ml of 4.8 M CsCl in 10
mM EDTA) and centrifuged at 37,000 rpm for 16 h at 15°C in a SW41 Ti rotor
(Beckman Instruments, Inc., Fullerton, CA). The DNA band at the interphase was
collected and diluted 1:15 in 10 mM Tris-HCl/1 mM EDTA buffer, pH 8.0.
Banding of genomic DNA in CsCl was carried out by the standard method.

Isolation of DNA from λgt11 p1 cDNA Clone

Phage DNA from the λgt11 p1 clone was prepared by a rapid isolation
procedure. Clarified phage plate lysate (1 ml) was mixed with 270 µl of 25%
wt/vol polyethylene glycol (PEG 6000) in 2.5 M NaCl and incubated at room
temperature for 15 min. The mixture was then spun for 5 min in a microfuge
(Eppendorf, Federal Republic of Germany), and the supernatant was removed.
The pellet was dissolved in 100 µl of 10 mM Tris/HCl pH 8.0 containing 1 mM
EDTA and 100 mM NaCl. This DNA preparation was extracted 3 times with
phenol/chloroform (1:1) and the DNA was precipitated by ethanol.

DNA Hybridization

Nucleic acid was radiolabeled with 32P by nick translation (10).
DNA samples were digested with appropriate restriction enzymes using
conditions recommended by the supplier. Southern blots were prepared using
Zeta-Probe membranes (Bio-Rad Laboratories, Richmond, CA).
Prehybridization, hybridization and posthybridization washes were carried out
according to the manufacturer's recommendations (bulletin 1234, Bio-Rad
Laboratories).

Cloning and DNA Sequencing

To clone the 0.8-kb cDNA insert from clone λgt11 p1 into plasmid
pUC8, phage DNA was digested with EcoRI restriction enzyme and then ligated
to EcoRI-digested pUC8 DNA and used to transform Escherichia coli JM83. The
resulting recombinant plasmid was designated pHDM1.

To obtain clones for DNA sequence analysis, the cDNA insert was
isolated from pHDM1 and ligated to M13-derived sequencing vectors mp18 and
mp19 (16). Transformation was carried out using E. coli JM107 and sequencing
was performed by the dideoxynucleotide chain termination method (11).

RESULTS

Several phage clones reacted with the rabbit anti-Der p I serum and
hybridized with all 3 oligonucleotide probes. One of these, λgt11 p1(13T), was
examined further. The nucleotide sequence of the cDNA insert from this clone,
λgt11 p1, was determined using the sequencing strategy shown in Fig. 2. The
complete sequence was shown to be 857 bases long and included a 69-base-long 5'
proximal end sequence, a coding region for the entire native Der p I protein of 222
amino acids with a derived molecular weight of 25,371, an 89-base-long 3'
noncoding region and a poly (A) tail of 33 residues (Figures 1A and 1B).
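
A derived molecular weight of this kind is conventionally obtained by summing
average residue masses over the predicted amino acid sequence and adding the
mass of one water molecule. The following sketch illustrates that calculation
only. It assumes the commonly tabulated average residue masses, since the text
does not state which values were used to arrive at 25,371, and the function name
and the toy peptide are hypothetical.

```python
# Average residue masses in daltons (commonly tabulated values; an assumption
# here, because the source does not say which table was used).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # one water is added for the free N- and C-termini


def derived_molecular_weight(protein):
    """Return the average molecular weight of an unmodified polypeptide
    given its one-letter amino acid sequence."""
    residues = protein.upper()
    if not residues:
        raise ValueError("empty sequence")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS


# Toy peptide (hypothetical; not the Der p I sequence).
toy_peptide = "TNACS"
print(round(derived_molecular_weight(toy_peptide), 1))
```

The value obtained this way ignores glycosylation and any other
post-translational modification.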

The assignment of a threonine residue at position 1 as the NH2-terminal
amino acid of Der p I was based on data obtained by NH2-terminal amino
acid sequencing of the pure protein isolated from mite excretions (17). The
predicted amino acid sequence matched with data obtained by amino acid sequence
analysis of the NH2-terminal region as well as with internal sequences derived from
analyses of tryptic peptides (Figures 1A and 1B). The complete mature protein is
coded by a single open reading frame terminating at the TAA stop codon at
nucleotide position 736-738. At present, it is not certain whether the first ATG
codon at nucleotide position 16-18 is the translation initiation site, since the
immediate flanking sequence of this ATG codon (TTGATGA) showed no
homology with the Kozak consensus sequence (ACCATGG) for the eukaryotic
translation initiation sites (18). In addition, the 5' proximal end sequence does not
code for a typical signal peptide sequence (see below).
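
The comparison of the ATG flanking sequence with the Kozak consensus can be
checked position by position. The sketch below uses only the two sequences
quoted above (TTGATGA and ACCATGG); the function name is hypothetical, and the
simple match count is an illustrative assumption rather than the scoring used
in reference (18).

```python
def flanking_matches(context, consensus, start_codon="ATG"):
    """Return (matches, positions compared) for the bases flanking the start
    codon, comparing a candidate context with a consensus of equal length."""
    context, consensus = context.upper(), consensus.upper()
    if len(context) != len(consensus):
        raise ValueError("context and consensus must have the same length")
    start = consensus.find(start_codon)
    if start < 0 or context[start:start + 3] != start_codon:
        raise ValueError("start codon must occupy the same position in both")
    # Compare every position except the three bases of the start codon itself.
    flank = [i for i in range(len(consensus)) if not start <= i < start + 3]
    matches = sum(context[i] == consensus[i] for i in flank)
    return matches, len(flank)


# Sequences as quoted in the text.
matches, compared = flanking_matches("TTGATGA", "ACCATGG")
print(matches, "of", compared, "flanking positions match")
```

With these two sequences none of the four flanking positions agree, which is
consistent with the statement that no homology was found.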

The amino acid sequence predicted by nucleotide analysis is shown in
Figures 1A and 1B. A protein data-base search revealed that the Der p I amino
acid sequence showed homology with a group of cysteine proteases. Previous
cDNA studies have shown that lysosomal cathepsin B, a mouse macrophage
protease and a cysteine protease from an amoeba have transient pre- and proform
intermediates (19-21), and inspection of the amino acid sequence at the 5' proximal
end of the λgt11 p1 cDNA clone suggests that Der p I may be similar. First, the
hydrophilicity plot (22) of the sequence preceding the mature protein sequence
lacks the characteristic hydrophobic region of a signal peptide (23) and second, an
Ala-X-Ala sequence, the most frequent sequence preceding the signal peptidase
cleavage site (24,25), is present at positions -13, -14, -15 (Figures 1A and 1B).
Therefore, it is proposed that cleavage between the pro-Der p I sequence and the
pre-Der p I sequence occurs between Ala (-13) and Phe (-12). Thus, the pro-Der p I
sequence begins at residue Phe (-12) and ends at residue Glu (-1). The amino
acid residues numbered -13 to -23 would then correspond to a partial signal
peptide sequence. The full length of the Der p I preproenzyme sequence has been
determined and is shown in Figures 21A and 21B. The negative sequence
numbers refer to the pre- and preproenzyme forms of Der p I.
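
A plot of the kind used above to look for the hydrophobic core of a signal
peptide can be approximated by averaging a per-residue scale over a sliding
window. The sketch below is illustrative only. The method of reference (22) is
not identified in this section, so the Kyte-Doolittle hydropathy scale (on which
high values mean hydrophobic, the opposite sense to a hydrophilicity scale), the
window of 7 residues, the function name and the toy sequence are all
assumptions.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Toy sequence (hypothetical): a leucine-rich stretch followed by acidic residues.
profile = hydropathy_profile("LLLLLLLDDDDDDD", window=7)
print([round(value, 2) for value in profile])
```

A sustained run of strongly positive window averages near the N-terminus would
be the signature of a signal peptide core; the text reports that no such region
was seen in the sequence preceding mature Der p I.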

When the 857-bp cDNA insert was radiolabeled and hybridized
against a Southern blot of EcoRI-digested genomic DNA from house dust mite,
hybridization to bands of 1.5, 0.5, and 0.35 kb was observed (data not shown). As
shown in the restriction enzyme map of the cDNA insert (Figure 2), there was no
internal EcoRI site and the multiple hybridization bands observed suggest that
Der p I is coded by a noncontiguous gene. The results also showed little evidence
of gene duplication since hybridization was restricted to fragments with a total
length of 2.4 kb.

The N-terminal sequence can be compared with the N-terminal sequence of the
equivalent protein from D. farinae (Der f I) (12). There is identity in 11/20
positions of the sequences available for comparison (Fig. 3).

To examine the protein produced by λgt11 p1(13T), phage was
lysogenized in Y1089 (r-) and the bacteria grown in broth culture at 30°C. Phage
was induced by temperature switch and isopropyl thiogalactopyranoside (IPTG)
(6) and the bacteria were suspended in PBS to 1/20 of the culture volume, and
sonicated for an antigen preparation. When examined by 7.5% SDS-PAGE
it was found that λgt11 p1(13T) did not produce a Mr 116K β-galactosidase
band but instead produced a 140K band, consistent with a fusion
protein with the Der p I contributing a 24 kDa moiety (6). Rabbit anti-Der p I was
shown to react with the lysate from λgt11 p1(13T) (Fig. 4).

EXAMPLE 2

Expression of Der p I cDNA products reactive with IgE from allergic serum.

The DNA insert from λgt11 p1(13T) which codes for Der p I was
subcloned into the EcoRI site of the plasmid expression vector (pGEX) (26) where
it could be expressed as a fusion with a glutathione transferase molecule. E. coli
infected with this plasmid pGEX-p1(13T) or with the vector alone were grown to a
log phase culture and harvested by centrifugation. The bacteria were suspended in
PBS to 1/20 of their culture volume and lysed by freeze-thawing. The lysate was
shown by sodium dodecyl sulphate polyacrylamide electrophoresis to express a
fusion protein in high concentration of the expected Mr 50,000. These lysates
were then tested for their ability to react with IgE from allergic serum by
radioimmune dot-blot conducted by the method described by Thomas and Rossi
(27). The serum was taken from donors known to be mite-allergic or from
non-allergic controls. Reactivity was developed by 125I-monoclonal anti-IgE and
autoradiography. Figure 5 shows that the lysate from pGEX-p1(13T), but not the
vector control, reacted with IgE in allergic serum, but not in non-allergic serum.

EXAMPLE 3

Inhibition of IgE antibody responses to Der p I by
treatment with the product from a cDNA clone
coding for Der p I.

E. coli lysogenized by λgt11 p1(13T) were grown and induced by
temperature switch to produce a recombinant fusion protein which was consistent
with a 24 kD Der p I moiety and a 116 kD β-galactosidase moiety (p1(13T)) (28).
This protein was mostly insoluble and could be isolated to about 90% purity,
judged by sodium dodecyl sulphate polyacrylamide electrophoresis, by differential
centrifugation. A similar protein was produced from another λgt11 cDNA mite
clone, λgt11 pX(2c). To test for the ability of the recombinant protein to modify IgE
antibody responses to Der p I, groups of 4-5 CBA mice were injected
intraperitoneally with 2 mg of the p1(13T) or pX(2c) fusion proteins and after 2
days given a subcutaneous injection of 5 mg of native Der p I (from mite culture
medium) in aluminium hydroxide gel. The IgE antibody titres were measured by
passive cutaneous anaphylaxis (PCA) after 3 and 6 weeks. The methods and
background data for these responses have been described by Stewart and Holt
(29). For a specificity control, groups of mice injected with p1(13T) or pX(2c)
were also injected with 10 mg of ovalbumin in alum. Responses were compared to
mice without prior p1(13T) or pX(2c) treatment (Table 1). After 3 weeks mice
either not given an injection of recombinant protein or injected with the control
pX(2c) had detectable anti-Der p I PCA titres (1/2 or greater). Only 1/5 of mice
treated with recombinant p1(13T) had a detectable titre and this, at 1/4, was lower
than all of the titres of both control groups. Titres of all groups at 6 weeks were
low or absent (not shown). The PCA response to ovalbumin was not significantly
affected by treatment with recombinant proteins. These data show the potential of
the recombinant proteins to specifically decrease IgE responses as required for a
desensitizing agent.
TABLE 1  Inhibition of anti-Der p I IgE by preinjection with recombinant
Der p I.

group   preinjection   immunizing injection   IgE (PCA) titres at d21
        (-2 days)      (d0) (5mg/alum)        responders    titres

1       -              Der p I                4/4           1/16-1/64
2       pX(2c)         Der p I                5/5           1/8-1/16
3       p1(13T)        Der p I                1/5*          1/4*
4       -              ovalbumin              4/4           1/64-1/256
5       pX(2c)         ovalbumin              5/5           1/32-1/128
6       p1(13T)        ovalbumin              5/5           1/64-1/256

Mice were given a preinjection on day -2 and then immunized with Der p I or
ovalbumin on day 0. Serum antibody titres were measured on days 21 and 42 by
PCA in rat skin. Significant anti-Der p I titres were not detected on day 42 (not
shown). The PCA were measured to Der p I for groups 1-3 and ovalbumin for
groups 4-6. The anti-Der p I titres were lower (p<0.001)* when pretreated with
recombinant Der p I p1(13T).

*Mann-Whitney analysis.

EXAMPLE 4

Expression of Der p I antigenic determinants by
fragments of the cDNA from λgt11 p1(13T)

The cDNA from λgt11 p1(13T) coding for Der p I was fragmented by
sonication. The fragments (in varying size ranges) were isolated by
electrophoresis and filled in by the Klenow reaction to create blunt ends. EcoRI
linkers were attached and the fragment libraries cloned in λgt11. The methods
used for the fragment cloning were the same as those used for cDNA cloning (6).
Plaque immunoassay was used for screening with rabbit anti-Der p I. Three phage
clones reacting with the antiserum were isolated and the oligonucleotide sequences
of the cloned fragments obtained. Two of these were found to code for Der p I
amino acids 17-55 (see Figures 1A and 1B for numbering) and one for amino acids
70-100. Such fragments will eventually be useful both as diagnostic reagents to
determine epitope reactivity and for therapy where molecules of limited
allergenicity may increase safety of desensitisation.

EXAMPLE 5

Cloning and expression of cDNA coding for the major mite allergen Der p II.

The Dermatophagoides pteronyssinus cDNA library in λgt11
previously described was screened by plaque radioimmune assay using
nitrocellulose lifts (6). Instead of using specific antisera, the serum used was
from a person allergic to house dust mites. The serum (at 1/2 dilution) was
absorbed with E. coli. To detect reactivity an 125I-labelled monoclonal anti-IgE
was used (at 30 ng/ml with 2x10^6 cpm/ml (approx. 30% counting efficiency)).
After 1 hour the filters were washed and autoradiography performed. Using this
procedure 4 clones reacting with human IgE were isolated. It was found they were
related by DNA hybridization and had an identical pattern of reactivity against a
panel of allergic sera. Fig. 6 shows IgE reactivity in plaque radioimmunoassay
against allergic serum (AM) (top row) or non-allergic serum (WT). Here, clones 1,
3 and 8 react strongly, but only against allergic sera. The amp 1 segments
(present in row 1) are a λgt11 vector control. The bottom row is an immunoassay
with rabbit anti-Der p I, developed by 125I-staphylococcus protein A, which shows
no significant reactivity. The clones were tested against a panel of sera. Serum
from five patients without allergy to mite did not react, but serum from 14/17
people with mite allergy showed reactivity. The DNA insert from the clone
λgt11 pII(C1) was subcloned into M13 mp18 and M13 mp19 and sequenced by the
chain termination method. The nucleotide sequence (Figures 7A and 7B) showed
this allergen was Der p II by (a) the homology of the inferred amino acid sequence
of residues 1-40 with that of the N-terminal amino acid sequence of Der p II (30);
and (b) the homology of this sequence with the equivalent Der f II allergen from
Dermatophagoides farinae (30).

EXAMPLE 6

Isolation and Characterization of cDNA Coding for Der f I

MATERIALS AND METHODS

Dermatophagoides farinae culture

Mites were purchased from Commonwealth Serum Laboratories,
Parkville, Australia.

Construction of the D. farinae cDNA λgt11 library

Polyadenylated mRNA was isolated from live D. farinae mites and
cDNA was synthesized by the RNase H method (Gubler, U. and B.J. Hoffman,
Gene 25:263-269 (1983)) using a kit (Amersham International, Bucks.). After the
addition of EcoRI linkers (New England Biolabs, Beverly, MA) the cDNA was
ligated to alkaline phosphatase treated λgt11 arms (Promega, Madison, WI). The
ligated DNA was packaged and plated in E. coli Y1090 (r-) to produce a library of
2x10^ recombinants.

Isolation of Der f I cDNA clones from the D. farinae cDNA λgt11 library

Screening of the library was performed by hybridization with two
probes comprising the two Der p I cDNA BamHI fragments 1-348 and 349-857
generated by BamHI digestion of a derivative of the Der p I cDNA which has had
two BamHI restriction sites inserted between amino acid residues -1 and 1 and
between residues 116 and 117 by site-directed mutagenesis (Chua, K.Y. et al.,
J. Exp. Med. 162:175-182 (1988)). The probes were radiolabelled with 32P by
nick translation. Phage were plated at 20,000 pfu per 150 mm petri dish and
plaques were lifted onto nitrocellulose (Schleicher and Schuell, Dassel, FRG),
denatured and baked (Maniatis, T. et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory Press (1982)). Prehybridizations were performed
for 2 hours at 42°C in 50% formamide/5 x SSCE/1 x Denhardt's/poly C
(0.1 mg/ml)/poly U (0.1 mg/ml) with hybridization overnight at 42°C at 10^6
cpm/ml. Post hybridization washes consisted of 15 min washes at room
temperature with 2 x sodium chloride citrate (SSC)/0.1% sodium dodecylsulphate
(SDS), 0.5 x SSC/0.1% SDS, 0.1 x SSC/0.1% SDS successively and a final wash
at 50°C for 30 min in 0.1 x SSC/1% SDS.

Isolation of DNA from λgt11 f1 cDNA clones

Phage DNA from λgt11 f1 clones was prepared by a rapid isolation
procedure. Clarified phage plate lysate (1 ml) was mixed with 270 µl of 25% wt/vol
polyethylene glycol (PEG 6000) in 2.5 M NaCl and incubated at room temperature
for 15 min. The mixture was then spun for 5 min in a microfuge (Eppendorf,
FRG), and the supernatant was removed. The pellet was dissolved in 100 µl of
10 mM Tris/HCl pH 8.0 containing 1 mM EDTA and 100 mM NaCl (TE). This
DNA preparation was extracted with phenol/TE, the phenol phase was washed
with 100 µl TE, the pooled aqueous phases were then extracted another 2 times
with phenol/TE, 2 times with Leder phenol (phenol/chloroform/isoamylalcohol;
25:24:1), once with chloroform and the DNA was precipitated by ethanol.

DNA sequencing

To obtain clones for DNA sequence analysis, the λgt11 f1 phage
DNA was digested with EcoRI restriction enzyme (Pharmacia, Uppsala, Sweden)
and the DNA insert was ligated to EcoRI-digested M13-derived sequencing
vectors mp18 and mp19 (Maniatis, T. et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press (1982)).
Transformation was carried out using E. coli TG-1 and sequencing was performed
by the dideoxynucleotide chain termination method (Sanger, F. et al.,
Proc. Natl. Acad. Sci. USA, 74:5463-5467 (1977)) using the Sequenase version
2.0 DNA sequencing kit (U.S.B., Cleveland, Ohio).

Polymerase chain reaction (PCR)

PCR was performed by the Taq DNA polymerase method (Saiki,
R.K. et al., Science 239:487-491 (1988)) using the TaqPaq kit (Biotech
International, Bentley, WA) and the conditions recommended by the supplier with
10 ng of target DNA and 10 pmol of λgt11 primers (New England BioLabs,
Beverly, MA).

RESULTS

Isolation of Der f I cDNA clones

Two clones expressing the major mite allergen Der f I were isolated
from the D. farinae cDNA λgt11 library by their ability to hybridize with both of
the Der p I cDNA probes (nucleotides 1-348 and 349-857). This approach was
adopted because amino acid sequencing had shown high homology (80%)
between these two allergens (Thomas, W.R., et al., Advances in the Biosciences,
14:139-147 (1989)). Digestion of the λgt11 f1 clone DNA with EcoRI restriction
enzyme to release the cDNA insert produced three Der f I cDNA EcoRI
fragments: one approximately 800 bases long and a doublet approximately 150
bases long. The Der f I cDNA insert was also amplified from the phage DNA by
the polymerase chain reaction (PCR) resulting in a PCR product of approximately
1.1 kb. Each Der f I cDNA fragment was cloned separately into the M13-derived
sequencing vectors mp18 and mp19 and sequenced.

DNA sequence analysis

The nucleotide sequence of Der f I cDNA was determined using the
sequencing strategy shown in Figure 9. The complete sequence was shown to be
1084 bases long and included a 335-base long 5' proximal end sequence, a coding
region for the entire native Der f I protein of 223 amino acids with a derived
molecular weight of 25,191 and an 80-base long 3' noncoding region (Fig. 10).
The assignment of the threonine residue at position 1 as the NH2-terminal amino
acid of Der f I was based on data obtained by NH2-terminal amino acid sequencing
of the native protein and the predicted amino acid sequence of recombinant
Der p I (Chua, K.Y. et al., J. Exp. Med., 162:175-182 (1988)). The predicted amino
acid sequence of the Der f I cDNA in the NH2-terminal region matched completely
with that determined at the protein level (Figures 10A and 10B).

The complete mature protein is coded by a single open reading frame
terminating at the TGA stop codon. The ATG codon at nucleotide position 42-44 is
presumed to be the translation initiation site since the subsequent sequence codes
for a typical signal peptide sequence.

Amino Acid Sequence Analysis 

The amino acid sequence of Der f I predicted by nucleotide analysis is
shown in Figures 10A and 10B. As shown in the composite alignment of the amino
acid sequences of mature Der p I and Der f I (Figure 11), high homology was observed
between the two proteins. Sequence homology analysis revealed that the Der f I
protein showed 81% homology with the Der p I protein, as predicted by previous
conventional amino acid sequencing. In particular, the residues making up the active
site of Der p I, based on those determined for papain, actinidin, cathepsin H, and
cathepsin B, are also conserved in the Der f I protein. The residues are glutamine
(residue 29), glycine, serine and cysteine (residues 33-35), histidine (residue 171) and
asparagine, serine and tryptophan (residues 191-193), where the numbering refers to
Der f I. The predicted mature Der f I amino acid sequence contains a potential
N-glycosylation site (Asn-Thr-Ser) at position 53-55, which is also present as
Asn-Gln-Ser at the equivalent position in Der p I.

Analysis of the predicted amino acid sequence of the entire Der f I cDNA
insert has shown that, as for other cysteine proteases (Figures 12A and 12B), the
Der f I protein has pre- and proform intermediates. As previously mentioned, the
methionine residue at position -98 is presumed to be the initiation methionine. This
assumption rests on two observations. First, the 5' proximal end sequence from residues
-98 to -81 is composed predominantly of hydrophobic amino acid residues (72%),
which is the characteristic feature of signal peptides (Von Heijne, G., EMBO J.,
1:2315-2323 (1984)). Second, the lengths of the presumptive pre- (18 amino acid
residues) and pro-peptides (80 residues) are similar to those for other cysteine
proteases (Figures 12A and 12B). Most cysteine proteases examined have about 120
preproenzyme residues (of which an average of 19 residues form the signal peptide),
with cathepsin B the smallest with 80 (Ishidoh, K. et al., FEBS Letters, 226:32-37
(1987)). Der f I falls within this range with a total of 98 preproenzyme residues.
By following the method for predicting signal-sequence cleavage
sites outlined in Von Heijne, it is proposed that cleavage from the pre-Der f I
sequence for proenzyme formation occurs at the signal peptidase cleavage site
lying between Ala (-81) and Arg (-80) (Von Heijne, G., Eur. J. Biochem., 122:17-21
(1988) and J. Mol. Biol., 184:99-105 (1985)). Thus, the sequence from
residues -98 to -81 codes for the leader peptide, while the proenzyme moiety of
Der f I begins at residue Arg (-80) and ends at residue Glu (-1).
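
The signal-peptide argument can be checked with a little arithmetic. The following is a minimal sketch, not part of the original analysis: it takes residues -98 to -81 as transcribed from SEQ ID NO:6, and it assumes a particular set of "hydrophobic" residues (Ala, Ile, Leu, Met, Phe, Val, Trp), since the text does not say which residues were counted. The names `SIGNAL_PEPTIDE`, `HYDROPHOBIC` and `hydrophobic_fraction` are illustrative only.

```python
# Residues -98 to -81 of the predicted Der f I precursor, transcribed from
# SEQ ID NO:6 in one-letter code:
# Met Lys Phe Val Leu Ala Ile Ala Ser Leu Leu Val Leu Ser Thr Val Tyr Ala
SIGNAL_PEPTIDE = "MKFVLAIASLLVLSTVYA"

# Assumption: the text does not define "hydrophobic"; this set is one
# common choice and is used here only for illustration.
HYDROPHOBIC = set("AILMFVW")

PRE_LENGTH = 18  # presumptive signal (pre-) peptide, residues -98 to -81
PRO_LENGTH = 80  # presumptive pro-peptide, residues -80 to -1


def hydrophobic_fraction(peptide: str) -> float:
    """Return the fraction of residues that belong to the HYDROPHOBIC set."""
    hits = sum(1 for residue in peptide if residue in HYDROPHOBIC)
    return hits / len(peptide)


fraction = hydrophobic_fraction(SIGNAL_PEPTIDE)
prepro_total = PRE_LENGTH + PRO_LENGTH

print(f"signal peptide length: {len(SIGNAL_PEPTIDE)}")
print(f"hydrophobic fraction:  {fraction:.0%}")
print(f"preproenzyme residues: {prepro_total}")
```

Under this assumed residue set, 13 of the 18 residues are hydrophobic, which agrees with the 72% figure quoted in the text, and the pre- and pro-peptide lengths sum to the 98 preproenzyme residues.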

EXAMPLE 7 

Isolation and Characterization of cDNA Coding for Der f II
MATERIALS AND METHODS
Amino acid sequence analysis
Preparation of λgt11 D. farinae cDNA ligations

D. farinae was purchased from Commonwealth Serum Laboratories,
Parkville, Australia, and used to prepare mRNA (polyadenylated RNA) as
described (Stewart, G.A. and W.R. Thomas, Int. Arch. Allergy Appl. Immunol.,
83:384-389 (1987)). The mRNA was suspended at approximately 0.5 mg/ml and
5 µg used to prepare cDNA by the RNase H method (Gubler, U. and Hoffman,
B.J., Gene, 25:263-269 (1983)) using a kit (Amersham International, Bucks).
EcoRI linkers (Amersham, GGAATTCC) were attached according to the method
described by Huynh et al. (Constructing and screening cDNA libraries in λgt10 and
λgt11, In: Glover, DNA Cloning vol. A practical approach pp. 47-78 IRL Press,
Oxford (1985)). The DNA was then digested with EcoRI and recovered from an
agarose gel purification by electrophoresis into a DEAE membrane (Schleicher
and Schuell, Dassel, FRG, NA-45) according to protocol 6.24 of Sambrook et al.
(Molecular Cloning: A Laboratory Manual, 2d Ed., Cold Spring
Harbor Laboratory Press (1989)), except 0.5M arginine base was used for elution.
The cDNA was then ligated in λgt10 and λgt11 at an arms to insert ratio of 2:1.
Some was packaged for plaque libraries and an aliquot retained for isolating
sequences by polymerase chain reaction as described below.

Isolation of Der f II cDNA by Polymerase Chain Reaction

To isolate Der f II cDNA, an oligonucleotide primer based on the
N-terminal sequence of Der p II was made because their amino acid residues are
identical in these regions (Heymann, P.W. et al., J. Allergy Clin. Immunol.,
83:1055-1087 (1989)). The primer 5'-GGATCCGATCAACTCGATGC-3' was
used. The first GGATCC encodes a BamHI site and the following sequence
GAT... encodes the first four residues of Der p II. For the other primer, the λgt11
5'-TTGACACCAGACCAACTGGTAATG-3' reverse primer flanking the EcoRI
cloning site was used (New England Biolabs, Beverly, MA). The Der p II primer
was designed to have approximately 50-60% G-C and to end on the first or
second, rather than the third, base of a codon (Gould, S.J. et al.,
Proc. Natl. Acad. Sci., 86:1934-1938 (1989); Summer, R. and D. Tautz,
Nucleic Acid Res., 11:6749 (1989)).

The PCR reactions were carried out in a final reaction volume of 25
µl containing 67mM Tris-HCl (pH 8.8 at 25°C), 16.6mM (NH4)2SO4, 40mM
dNTPs, 5mM 2-mercaptoethanol, 6mM EDTA, 0.2mg/ml gelatin, 2mM MgCl2,
10 pmoles of each primer and 2 units of Taq polymerase. Approximately 0.001mg
of target DNA was added and the contents of the tube were mixed and overlaid
with paraffin oil. The tubes were initially denatured at 95°C for 6 minutes, then
annealed at 55°C for 1 minute and extended at 72°C for 2 minutes. Thereafter for
38 cycles, denaturing was carried out for 30 seconds and annealing and extension
as before. In the final (40th) cycle, the extension reaction was increased to 10
minutes to ensure that all amplified products were full length. The annealing
temperature was deliberately set slightly lower than the Tm of the oligonucleotide
primers (determined by the formula Tm = 69.3 + 0.41(G+C%) - 650/oligo length) to
allow for mismatches in the N-terminal primer.
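
As a rough illustration of the Tm rule mentioned above, the sketch below evaluates it for the N-terminal primer given in this example. It assumes the formula reads Tm = 69.3 + 0.41 × (G+C%) - 650 / (oligo length), which is the usual form of this rule; the function names are hypothetical and the calculation is an estimate, not a measured value.

```python
def gc_percent(oligo: str) -> float:
    """Return the G+C content of an oligonucleotide as a percentage."""
    oligo = oligo.upper()
    gc_count = sum(1 for base in oligo if base in "GC")
    return 100.0 * gc_count / len(oligo)


def melting_temperature(oligo: str) -> float:
    """Estimate Tm in degrees C, assuming the formula
    Tm = 69.3 + 0.41 * (G+C%) - 650 / length."""
    return 69.3 + 0.41 * gc_percent(oligo) - 650.0 / len(oligo)


# N-terminal primer as printed in this example (BamHI site plus Der p II codons).
N_TERMINAL_PRIMER = "GGATCCGATCAACTCGATGC"
ANNEALING_TEMP_C = 55.0  # annealing temperature used in the PCR protocol

tm = melting_temperature(N_TERMINAL_PRIMER)
print(f"G+C content: {gc_percent(N_TERMINAL_PRIMER):.0f}%")
print(f"estimated Tm: {tm:.2f} C")
print(f"annealing is {tm - ANNEALING_TEMP_C:.2f} C below the estimated Tm")
```

For this 20-mer the G+C content is 55%, inside the 50-60% design window, and the estimate comes out a few degrees above the 55°C annealing step, consistent with the stated intent of annealing slightly below Tm.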

5 µl of the reaction was then checked for amplified bands on a 1%
agarose gel. The remainder of the reaction mixture was extracted with chloroform
to remove all of the paraffin oil and ethanol precipitated prior to purification of
the amplified product on a low melting point agarose gel (Bio-Rad, Richmond,
CA).

Subcloning of PCR Product

The ends of the purified PCR product were filled in a reaction
containing 10 mM Tris HCl, 10 mM MgCl2, 50 mM NaCl, 0.025 mM dNTP and
1 µl of Klenow enzyme in a final volume of 100 µl. The reaction was carried out
at 37°C for 15 minutes and heat inactivated at 70°C for 10 minutes. The mixture
was Leder phenol extracted before ethanol precipitation. The resulting
blunt-ended DNA was ligated into M13mp18 digested with Sma I in a reaction
containing 0.5M ATP, 1 X ligase buffer and 1 unit of T4 ligase at 15°C for 24 hrs
and transformed into E. coli TG1 made competent by the CaCl2 method. The
transformed cells were plated out as a lawn on L + G plates and grown overnight
at 37°C.
Preparation of Single-stranded DNA Template for Sequencing

Isolated white plaques were picked using an orange stick into 2.5 ml
of an overnight culture of TG1 cells diluted 1 in 100 in 2 X TY broth, and grown
at 37°C for 6 hours. The cultures were pelleted and the supernatant removed to a
fresh tube. To a 1 ml aliquot of this supernatant 270 µl of 20% polyethylene
glycol, 2.5M NaCl was added and the tube was vortexed before allowing it to
stand at room temperature (RT) for 15 minutes. This was then spun down again
and all traces of the supernatant were removed from the tube. The pellet was then
resuspended in 100 µl of 1 X TE buffer. At least 2 phenol:TE extractions were
done, followed by 1 Leder phenol extraction and a CHCl3 extraction. The DNA
was precipitated in ethanol and resuspended in a final volume of 20 µl of TE
buffer.



DNA Analysis 

DNA sequencing was performed by the dideoxynucleotide chain
termination method (Sanger, F. et al., Proc. Natl. Acad. Sci., 24:5463-5467 (1977)) using
DNA produced from M13-derived vectors mp18 and mp19 in E. coli TG1 and T7
DNA polymerase (Sequenase version 2.0, USB Corp., Cleveland, Ohio).
Restriction endonucleases were from Toyobo (Osaka, Japan). All general
procedures were by standard techniques (Sambrook, J. et al., Molecular Cloning:
A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory Press (1989)). The
sequence analysis was performed using the MacVector software (IBI, New Haven, CT).

RESULTS 

D. farinae cDNA ligated in λgt11 was used to amplify a sequence
using an oligonucleotide primer with homology to nucleotides coding for the 4
N-terminal residues of Der p II and a reverse primer for the λgt11 sequence flanking
the cloning site. Two major bands of about 500 bp and 300 bp were obtained
when the product was gel electrophoresed. These were ligated into M13 mp18
and a number of clones containing the 500 bp fragment were analyzed by DNA
sequencing. Three clones produced sequence data from the N-terminal primer
end and one from the other orientation. Where the sequence data from the two
directions overlapped, a complete match was found. One of the clones read from
the N-terminal primer contained a one-base deletion which shifted the reading
frame. It was deduced to be a copying error, as the translated sequence from the
other two clones matched the protein sequence for the first 20 amino acid residues
of the allergen.



WO 94/05790 



PCT/US93/08518 




The sequence of the clones showing consensus and producing a correct
reading frame is shown in Figure 14, along with the inferred amino acid sequence. It
coded for a 129 residue protein with no N-glycosylation site and a calculated
molecular weight of 14,021 Da. No homology was found when compared to other
proteins on the GenBank data base (61.0 release). It did, however, show 88% amino
acid residue homology with Der p II, shown in the alignment in Figures 16A, 16B, and
16C. Seven out of the 16 changes were conservative. The conserved residues also
include all the cysteines present at positions 8, 21, 27, 73 and 119. There was also
considerable nucleotide homology, although the restriction enzyme map generated
from the sequence data for commonly used enzymes is different from Der p II (Figure
15). The hydrophobicity plots of the translated sequence of Der f II and Der p II
shown in Figures 17A and 17B are almost identical.
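
The statement that the 129-residue protein has no N-glycosylation site can be illustrated with a simple motif scan. This is a sketch only: it assumes the conventional Asn-X-Ser/Thr sequon (X being any residue except Pro), uses the Der f II sequence transcribed by hand from SEQ ID NO:8 into one-letter code, and the function name `find_sequons` is hypothetical.

```python
# Der f II as listed in SEQ ID NO:8, transcribed to one-letter code in
# rows of 16 residues to match the layout of the sequence listing.
DER_F_II = (
    "DQVDVKDCANNEIKKV"
    "MVDGCHGSDPCIIHRG"
    "KPFTLEALFDANQNTK"
    "TAKTEIKASLDGLEID"
    "VPGIDTNACHFMKCPL"
    "VKGQQYDAKYTWNVPK"
    "IAPKSENVVVTVKLVG"
    "DNGVLACAIATHAKIR"
    "D"
)


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an
    Asn-X-Ser/Thr motif, where X is any residue except Pro."""
    positions = []
    for i in range(len(protein) - 2):
        if protein[i] == "N" and protein[i + 1] != "P" and protein[i + 2] in "ST":
            positions.append(i + 1)
    return positions


# A short Der f I fragment containing the Asn-Thr-Ser site (Tyr Arg Asn Thr Ser Leu Asp).
DER_F_I_FRAGMENT = "YRNTSLD"

print("Der f II length:", len(DER_F_II))
print("Der f II sequons:", find_sequons(DER_F_II))
print("Der f I fragment sequons:", find_sequons(DER_F_I_FRAGMENT))
```

With this motif definition the scan finds no site in Der f II, while the Der f I fragment is flagged at its Asn-Thr-Ser triplet.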

EXAMPLE 8 

Determination of Nucleotide Sequence Polymorphisms in
the Der p I, Der p II and Der f II Allergens

It was expected that there were sequence polymorphisms in the nucleic
acid sequence coding for Der p I, Der p II, Der f I and Der f II, due to natural allelic
variation among individual mites. Several nucleotide and resulting amino acid
sequence polymorphisms were discovered during the sequencing of different Der p I,
Der p II and Der f II clones. The amino acid sequence polymorphisms are shown in
Figures 18, 19 and 20.

The original Der p I λgt11 cDNA library was reprobed with cDNA
obtained from the λgt11 p1(13T) clone to identify new clones. Similarly, the λgt11
cDNA library of Der p II was reprobed with cDNA obtained from the λgt11 pII(C1)
clone to identify additional Der p II clones. These clones were isolated, sequenced
and found to contain nucleotide and resulting amino acid sequence polymorphisms
(see Fig. 18 and 19).

Four Der p I clones, (b), (c), (d) and (e), were sequenced, as shown in Fig.
18. Clone Der p I(d) was found to contain the following polymorphisms relative to
the clone Der p I(a) sequence: (1) the codon for amino acid residue 136 was ACC
rather than AGC, which results in a predicted amino acid substitution of Thr for Ser;
(2) the codon for amino acid residue 149 had a silent mutation, GCT rather than GCA;
and (3) the codon for amino acid residue 215 was CAA rather than GAA, which
results in a predicted amino acid substitution of Gln for Glu.
The Der p II clones, Der p II(1) and Der p II(2), were sequenced as
shown in Figure 19. Clone Der p II(2) was found to have the codon TCA, rather
than ACA, at amino acid residue 47, which results in a predicted amino acid
substitution of Ser for Thr. This clone also was found to have the codon AAT at
amino acid residue 113 rather than GAT, which results in a predicted amino acid
substitution of Asn for Asp. The codon for amino acid 127 of this clone was
found to be CTC rather than ATC. This change in codon 127 results in a
predicted amino acid substitution of Leu for Ile.

Additional Der f II cDNA clones containing nucleic acid and
resulting amino acid sequence polymorphisms were obtained from PCR reactions
using cDNA prepared with RNA isolated from D. farinae mites (Commonwealth
Serum Laboratories, Parkville, Australia). cDNA was prepared and ligated in
λgt10 as previously described (Trudinger et al. (1991) Clin. Exp. Allergy 21:33-37).
The clones described below were isolated following PCR of the λgt10 library
using a 5' primer, which had the sequence 5'-GGATCCGATCAAGTCGATGT-3'.
The nucleotides 5'-GGATCC-3' of the 5' primer correspond to a Bam HI
endonuclease site added for cloning purposes. The remaining nucleotides of the 5'
primer, 5'-GATCAAGTCGATGT-3', correspond to the first 4 amino acids of
Der p II (Chua et al. (1990) Int. Arch. Allergy Clin. Immunol. 21:118-123) as
described in Trudinger et al. ((1991) Clin. Exp. Allergy 21:33-37). The 3' primer,
which has the sequence 5'-TTGACACCAGACCAACTGGTAATG-3',
corresponds to a sequence of the λgt10 cloning vector (Trudinger et al., supra).

PCR was performed as described (Trudinger et al., supra) and four
Der f II clones, MT3, MT5, MT16 and MT18, were sequenced, as shown in
Figure 20. Three clones were sequenced that had potential polymorphisms
relative to the published Der f II sequence (Trudinger et al., supra). The codon for
amino acid 52 of clone MT18 was ATT rather than the published ACT (Trudinger
et al., supra). This change in codon 52 of clone MT18 would result in a predicted
amino acid change from Thr to Ile. Clone MT5 contained three changes from the
published sequence (Trudinger et al., supra): (1) the codon for amino acid 11 was
AGC rather than the published AAC (Trudinger et al., supra), which results in a
predicted amino acid substitution of Ser for Asn; (2) the codon for amino acid 52
was ATT, rather than the published ACT (Trudinger et al., supra), which results in
a predicted amino acid substitution of Ile for Thr; and (3) the codon for amino
acid 88 was ATC rather than the published GCC (Trudinger et al., supra), which
results in a predicted amino acid substitution of Ile for Ala. Clone MT16 had a
silent mutation in the codon for amino acid 68 (ATC versus the published ATT
(Trudinger et al., supra)) that did not change the predicted amino acid at this
residue. The following substitutions were also observed by Yuuki et al.
(Jpn. J. Allergol. 6:557-561, 1990): Ile at residue 52, Ile at residue 54 and Ile at
residue 88.
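
The codon changes reported in this example can be cross-checked against the standard genetic code. The sketch below is illustrative only: the codon pairs are copied from the text above, the codon table is the standard nuclear code built in TCAG order, and the helper name `describe` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Three-letter names for the residues that occur in the reported changes.
THREE_LETTER = {
    "A": "Ala", "D": "Asp", "E": "Glu", "I": "Ile", "L": "Leu",
    "N": "Asn", "Q": "Gln", "S": "Ser", "T": "Thr",
}

# (clone, residue number, reference codon, variant codon), as stated in the text.
CHANGES = [
    ("Der p I(d)", 136, "AGC", "ACC"),
    ("Der p I(d)", 149, "GCA", "GCT"),
    ("Der p I(d)", 215, "GAA", "CAA"),
    ("Der p II(2)", 47, "ACA", "TCA"),
    ("Der p II(2)", 113, "GAT", "AAT"),
    ("Der p II(2)", 127, "ATC", "CTC"),
    ("Der f II MT18", 52, "ACT", "ATT"),
    ("Der f II MT5", 11, "AAC", "AGC"),
    ("Der f II MT5", 52, "ACT", "ATT"),
    ("Der f II MT5", 88, "GCC", "ATC"),
    ("Der f II MT16", 68, "ATT", "ATC"),
]


def describe(reference: str, variant: str) -> str:
    """Translate both codons and describe the predicted effect."""
    ref_aa = CODON_TABLE[reference]
    var_aa = CODON_TABLE[variant]
    if ref_aa == var_aa:
        return f"silent ({THREE_LETTER[ref_aa]})"
    return f"{THREE_LETTER[var_aa]} for {THREE_LETTER[ref_aa]}"


for clone, residue, reference, variant in CHANGES:
    print(f"{clone} residue {residue}: {reference} -> {variant}: "
          f"{describe(reference, variant)}")
```

Each predicted substitution printed by this check matches the one stated in the text, and the changes at Der p I residue 149 and Der f II residue 68 come out as silent.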

Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no
more than routine experimentation, many equivalents to the specific embodiments
of the invention described herein. Such equivalents are intended to be
encompassed by the following claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: IMMULOGIC PHARMACEUTICAL CORPORATION
(B) STREET: 610 LINCOLN STREET
(C) CITY: WALTHAM
(D) STATE: MASSACHUSETTS
(E) COUNTRY: USA
(F) POSTAL CODE (ZIP): 02154
(G) TELEPHONE: (617) 466-6000
(H) TELEFAX: (617) 466-6010

(ii) TITLE OF INVENTION: CLONING AND SEQUENCING OF ALLERGENS FROM
DERMATOPHAGOIDES (HOUSE DUST MITES)

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: LAHIVE & COCKFIELD
(B) STREET: 60 STATE STREET, SUITE 510
(C) CITY: BOSTON
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02109

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: ASCII TEXT

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 07/945,288
(B) FILING DATE: 10 SEPTEMBER 1992

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (617) 227-7400
(B) TELEFAX: (617) 227-5941

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 834 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..738

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



AAA AAC CGA TTT TTG ATG AGT GCA GAA GCT TTT GAA CAC CTC AAA ACT 48
Lys Asn Arg Phe Leu Met Ser Ala Glu Ala Phe Glu His Leu Lys Thr
-23 -20 -15 -10

CAA TTC GAT TTG AAT GCT GAA ACT AAC GCC TGC AGT ATC AAT GGA AAT 96
Gln Phe Asp Leu Asn Ala Glu Thr Asn Ala Cys Ser Ile Asn Gly Asn
-5 -1 1 5

GCT CCA GCT GAA ATC GAT TTG CGA CAA ATG CGA ACT GTC ACT CCC ATT 144
Ala Pro Ala Glu Ile Asp Leu Arg Gln Met Arg Thr Val Thr Pro Ile
10 15 20 25

CGT ATG CAA GGA GGC TGT GGT TCA TGT TGG GCT TTC TCT GGT GTT GCC 192
Arg Met Gln Gly Gly Cys Gly Ser Cys Trp Ala Phe Ser Gly Val Ala
30 35 40

GCA ACT GAA TCA GCT TAT TTG GCT CAC CGT AAT CAA TCA TTG GAT CTT 240
Ala Thr Glu Ser Ala Tyr Leu Ala His Arg Asn Gln Ser Leu Asp Leu
45 50 55

GCT GAA CAA GAA TTA GTC GAT TGT GCT TCC CAA CAC GGT TGT CAT GGT 288
Ala Glu Gln Glu Leu Val Asp Cys Ala Ser Gln His Gly Cys His Gly
60 65 70

GAT ACC ATT CCA CGT GGT ATT GAA TAC ATC CAA CAT AAT GGT GTC GTC 336
Asp Thr Ile Pro Arg Gly Ile Glu Tyr Ile Gln His Asn Gly Val Val
75 80 85

CAA GAA AGC TAC TAT CGA TAC GTT GCA CGA GAA CAA TCA TGC CGA CGA 384
Gln Glu Ser Tyr Tyr Arg Tyr Val Ala Arg Glu Gln Ser Cys Arg Arg
90 95 100 105

CCA AAT GCA CAA CGT TTC GGT ATC TCA AAC TAT TGC CAA ATT TAC CCA 432
Pro Asn Ala Gln Arg Phe Gly Ile Ser Asn Tyr Cys Gln Ile Tyr Pro
110 115 120

CCA AAT GCA AAC AAA ATT CGT GAA GCT TTG GCT CAA ACC CAC AGC GCT 480
Pro Asn Ala Asn Lys Ile Arg Glu Ala Leu Ala Gln Thr His Ser Ala
125 130 135

ATT GCC GTC ATT ATT GGC ATC AAA GAT TTA GAC GCA TTC CGT CAT TAT 528
Ile Ala Val Ile Ile Gly Ile Lys Asp Leu Asp Ala Phe Arg His Tyr
140 145 150

GAT GGC CGA ACA ATC ATT CAA CGC GAT AAT GGT TAC CAA CCA AAC TAT 576
Asp Gly Arg Thr Ile Ile Gln Arg Asp Asn Gly Tyr Gln Pro Asn Tyr
155 160 165

CAC GCT GTC AAC ATT GTT GGT TAC AGT AAC GCA CAA GGT GTC GAT TAT 624
His Ala Val Asn Ile Val Gly Tyr Ser Asn Ala Gln Gly Val Asp Tyr
170 175 180 185

TGG ATC GTA CGA AAC AGT TGG GAT ACC AAT TGG GGT GAT AAT GGT TAC 672
Trp Ile Val Arg Asn Ser Trp Asp Thr Asn Trp Gly Asp Asn Gly Tyr
190 195 200

GGT TAT TTT GCT GCC AAC ATC GAT TTG ATG ATG ATT GAA GAA TAT CCA 720
Gly Tyr Phe Ala Ala Asn Ile Asp Leu Met Met Ile Glu Glu Tyr Pro
205 210 215

TAT GTT GTC ATT CTC TAAACAAAAA GACAATTTCT TATATGATTG TCACTAATTT 775
Tyr Val Val Ile Leu
220

ATTTAAAATC AAAATTTTTT AGAAAATGAA TAAATTCATT CACAAAAATT AAAAAAAAA 834



(2) INFORMATION FOR SEQ ID NO: 2: 

25 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 245 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

30 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 
Lys Asn Arg Phe Leu Met Ser Ala Glu Ala Phe Glu His Leu Lys Thr

-23 -20 -15 -10 

5 Gin Phe Asp Leu Asn Ala Glu Thr Asn Ala Cys Ser lie Asn Gly Asn 
-5 -11 5 

Ala Pro Ala Glu lie Asp Leu Arg Gin Met Arg Thr Val Thr Pro lie 
10 15 20 25 

10 ~ 

Arg Met Gin Gly Gly Cys Gly Ser Cys Trp Ala Phe Ser Gly Val Ala 
30 35 40 

Ala Thr Glu Ser Ala Tyr Leu Ala His Arg Asn Gin Ser Leu Asp Leu 
15 45 50 55 

Ala Glu Gin Glu Leu Val Asp Cys Ala Ser Gin His Gly Cys His Gly 
60 65 70 

20 Asp Thr He Pro Arg Gly He Glu Tyr He Gin His Asn Gly Val Val 
75 80 85 

Gin Glu Ser Tyr Tyr Arg Tyr Val Ala Arg Glu Gin Ser Cys Arg Arg 
90 95 100 105 

25 

Pro Asn Ala Gin Arg Phe Gly He Ser Asn Tyr Cys Gin He Tyr Pro 
110 115 120 

Pro Asn Ala Asn Lys He Arg Glu Ala Leu Ala Gin Thr His Ser Ala 
30 125 130 135 



He Ala Val He He Gly He Lys Asp Leu Asp Ala Phe Arg His Tyr 
140 145 / 150 
Asp Gly Arg Thr Ile Ile Gln Arg Asp Asn Gly Tyr Gln Pro Asn Tyr
155 160 165 

His Ala Val Asn He Val Gly Tyr Ser Asn Ala Gin Gly Val Asp Tyr 
5 170 175 180 185 

Trp He Val Arg Asn Ser Trp Asp Thr Asn Trp Gly Asp Asn Gly Tyr 
190 195 200 

10 " Gly" Tyr Phe Ala Ala Asn He Asp Leu Met Met He Glu Glu Tyr Pro 
205 210 215 

Tyr Val Val He Leu 
220 

15 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
20 (A) LENGTH: 588 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

25 (ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 69.. 509 

30 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CACAAATTCT TCTTTCTTCC TTACTACTGA TCATTAATCT GAAAACAAAA CCAAACAAAC 60 

5 CATTCAAA ATG ATG TAC AAA ATT TTG TGT CTT TCA TTG TTG GTC GCA GCC 110 
Met Tyr Lys lie Leu Cys Leu Ser Leu Leu Val Ala Ala 
-16 -15 -10 -5 

GTT OCT CGT GAT CAA GTC GAT GTC AAA GAT TGT GCC AAT CAT GAA ATC 158 
10 Val Ala Arg Asp Gin Val Asp Val Lys Asp Cys Ala Asn His Glu lie 
-11 5 10 

AAA AAA GTT TTG GTA CCA GGA TGC CAT GGT TCA GAA CCA TGT ATC ATT 206 
Lys Lys Val Leu Val Pro- Gly Cys His Gly Ser Glu Pro Cys He He 
15 15 20 25 

CAT CGT GGT AAA CCA TTC CAA TTG GAA GCC GTT -TTC GAA GCC AAC CAA 254 
His Arg Gly Lys Pro Phe Gin Leu Glu Ala Val Phe Glu Ala Asn Gin 
30 35 40 45 

20 

AAC ACA AAA ACG GCT AAA ATT GAA ATC AAA GCC TCA ATC GAT GGT TTA 302 
Asn Thr Lys Thr Ala Lys He Glu He Lys Ala Ser lie Asp Gly Leu 
50 55 60 

25 GAA GTT GAT GTT CCC GGT ATC GAT CCA AAT GCA TGC CAT TAC ATG AAA 350 
Glu Val Asp Val Pro Gly He Asp Pro Asn Ala Cys His Tyr Met Lys 
65 70 75 

TGC CCA TTG GTT AAA GGA CAA CAA TAT GAT ATT AAA TAT ACA TGG AAT 398 
30 Cys Pro Leu Val Lys Gly Gin Gin Tyr Asp He Lys Tyr Thr Trp Asn 
80 85 90 
GTT CCG AAA ATT GCA CCA AAA TCT GAA AAT GTT GTC GTC ACT GTT AAA 446

Val Pro Lys He Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys 
95 100 105 

5 GTT ATG GGT GAT GAT GGT GTT TTG GCC TGT GCT ATT GCT ACT CAT GCT 4 94 

Val Met Gly Asp Asp Gly Val Leu Ala Cys Ala He Ala Thr His Ala 
110 H5 120 125 

AAA ATC CGC GAT TAAATAAACA AAATTTATTG ATTTTGTAAT CACAAATGAT 546. 
10 Lys lie Arg Asp 



15 



TGATTTTCTT TCCAAAAAAA AAATAAATAA AATTTTGGGA AT 588 



(2) INFORMATION FOR SEQ ID N0:4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 146 amino acids 
20 (B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

25 (xi) SEQUENCE DESCRIPTION: SEQ ID N0:4: 

Met Met Tyr Lys lie Leu Cys Leu Ser Leu Leu Val Ala Ala Val Ala 
-16 -15 -10 -5 



30 Arg Asp Gin Val Asp Val Lys Asp Cys Ala Asn His Glu He Lys Lys 

-11 5 10 15 

Val Leu Val Pro Gly Cys His Gly Ser Glu Pro Cys He He His Arg 

20 25 30 
Gly Lys Pro Phe Gln Leu Glu Ala Val Phe Glu Ala Asn Gln Asn Thr
35 40 45 

5 Lys Thr Ala Lys lie Glu lie Lys Ala Ser He Asp Gly Leu Glu Val 
50 55 60 

Asp Val -Pro Gly He Asp Pro Asn Ala Cys His Tyr Met Lys Cys Pro 
65 70 75 

10 ~" 

Leu Val Lys Gly Gin Gin Tyr Asp He Lys Tyr Thr Trp Asn Val Pro 
80 85 90 95 

Lys He Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Val Met 
15 100 105 no 

Gly Asp Asp Gly Val Leu Ala Cys Ala He Ala Thr His Ala Lys He 
115 120 125 

20 Arg Asp 



(2) INFORMATION FOR SEQ ID NO: 5: 

25 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1072 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

30 (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE:

(A) NAME /KEY : CDS 

(B) LOCATION: 36.. 1001 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CGTTTTCTTC CAT C AAAATT AAAAATTCAT CAAAA ATG AAA TTC GTT TTG GCC 53 

Met Lys Phe Val Leu Ala 

10 r*T" -98 -95 

ATT GCC TCT TTG TTG GTA TTG AGC ACT GTT TAT GCT CGT CCA GCT TCA 101 
He Ala Ser Leu Leu Val Leu Ser Thr Val Tyr Ala Arg Pro Ala Ser 
-90 -85 -80 

15 

ATC AAA ACT TTT GAA GAA TTC AAA AAA GCC TTC AAC AAA AAC TAT GCC 149 
He Lys Thr Phe Glu Glu Phe Lys Lys Ala Phe Asn Lys Asn Tyr Ala 
-75 -70 -65 

20 ACC GTT GAA GAG GAA GAA GTT GCC CGT AAA AAC TTT TTG GAA TCA TTG 197 
Thr Val Glu Glu Glu Glu Val Ala Arg Lys Asn Phe Leu Glu Ser Leu 
-60 -55 -50 -45 

AAA TAT GTT GAA GCT AAC AAA GGT GCC ATC AAC CAT TTG TCC GAT TTG 245 
25 Lys Tyr Val Glu Ala Asn Lys Gly Ala He Asn His Leu Ser Asp Leu 

-40 -35 -30 

TCA TTG GAT GAA TTC AAA AAC CGT TAT TTG ATG AGT GCT GAA GCT TTT 293 
Ser Leu Asp Glu Phe Lys Asn Arg Tyr Leu Met Ser Ala Glu Ala Phe 
30 -25 -20 -15 

GAA CAA CTC AAA ACT GAA TTC GAT TTG AAT GCC GAA ACA AGC GCT TGC 341 
Glu Gin Leu Lys Thr Gin Phe Asp Leu Asn Ala Glu Thr Ser Ala Cys 
. -10 -5 -1 1 
CGT ATC AAT TCG GTT AAC GTT CCA TCG GAA TTG GAT TTA CGA TCA CTG 389

Arg lie Asn Ser Val Asn Val Pro Ser Glu Leu Asp Leu Arg Ser Leu 
5 10 15 20 

5 

CGA ACT GTC ACT CCA ATC CGT ATG CAA GGA GGC TGT GGT TCA TGT TGG 437 
. Arg Thr Val Thr Pro lie Arg Met Gin Gly Gly Cys Gly Ser Cys Trp 

25 30 35 

10 GCT TTC TCT GGT GTT GCC GCA ACT GAA TCA GCT TAT TTG GCC TAC CGT 485 
Ala Phe Ser Gly Val Ala Ala Thr Glu Ser Ala Tyr Leu Ala Tyr Arg 
40 45 50 

AAC ACG TCT TTG GAT CTT TCT GAA CAG GAA CTC GTC GAT TGC GCA TCT 533 
15 Asn Thr Ser Leu Asp Leu Ser Glu Gin Glu Leu Val Asp Cys Ala Ser 
55 60 65 

CAA CAC GGA TGT CAC GGC GAT ACA ATA CCA AGA GGC ATC GAA TAC ATC 581 
Gin His Gly Cys His Gly Asp Thr He Pro Arg Gly He Glu Tyr He 
20 70 75 80 

CAA CAA AAT GGT GTC GTT GAA GAA AGA AGC TAT CCA TAC GTT GCA CGA 629 
Gin Gin Asn Gly Val Val Glu Glu Arg Ser Tyr Pro Tyr Val Ala Arg 
85 90 95 100 



25 



GAA CAA CGA TGC CGA CGA CCA AAT TCG CAA CAT TAC GGT ATC TCA AAC 677 
Glu Gin Arg Cys Arg Arg Pro Asn Ser Gin His Tyr Gly lie Ser Asn 
105 HO 115 



30 



TAC TGC CAA ATT TAT CCA CCA GAT GTG AAA CAA ATC CGT GAA GCT TTG 725
Tyr Cys Gln Ile Tyr Pro Pro Asp Val Lys Gln Ile Arg Glu Ala Leu
120 125 130

ACT CAA ACA CAC ACA GCT ATT GCC GTC ATT ATT GGC ATC AAA GAT TTG 773
Thr Gln Thr His Thr Ala Ile Ala Val Ile Ile Gly Ile Lys Asp Leu
135 140 145

AGA GCT TTC CAA CAT TAT GAT GGA CGA ACA ATC ATT CAA CAT GAC AAT 821
Arg Ala Phe Gln His Tyr Asp Gly Arg Thr Ile Ile Gln His Asp Asn
150 155 160

GGT TAT CAA CCA AAC TAT CAT GCC GTC AAC ATT GTC GGT TAC GGA AGT 869 
10 :Gly Tyr Gin Pro Asn Tyr His Ala Val Asn He Val Gly Tyr Gly Ser 
;165 170 175 180 



ACA CAA GGC GAC GAT TAT TGG ATC GTA CGA AAC AGT TGG GAT ACT ACC 917 
Thr Gin Gly Asp Asp Tyr Trp He Val Arg Asn Ser Trp Asp Thr Thr 
15 185 190 195 



20 



TGG GGA GAT AGC GGA TAC GGA TAT TTC CAA GCC GGA AAC AAC CTC ATG 965 

Trp Gly Asp Ser Gly Tyr Gly Tyr Phe Gin Ala Gly Asn Asn Leu Met 
200 205 210 

ATG ATC GAA CAA TAT CCA TAT GTT GTA ATC ATG TGAACATTTG AAATTGAATA 1018 
Met He Glu Gin Tyr Pro Tyr Val Val He Met 
215 220 



25 TATTTATTTG TTTTCAAAAT AAAAACAACT ACTCTTGCGA GTATTTTTTA CTCG 1072 



(2) INFORMATION FOR SEQ ID N0:6: 



30 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

5 Met Lys Phe Val Leu Ala lie Ala Ser Leu Leu Val Leu Ser Thr Val 
-98 -95 -90 -85 

Tyr Ala Arg Pro Ala Ser He Lys Thr Phe Glu Glu Phe Lys Lys Ala 
-80 -75 -70 

10 ~ 

Phe Asn Lys Asn Tyr Ala Thr Val Glu Glu Glu Glu Val Ala Arg Lys 
-65 -60 -55 

Asn Phe Leu Glu Ser Leu Lys Tyr Val Glu Ala Asn Lys Gly Ala He 
15 -50 -45 -40 -35 

Asn His Leu Ser Asp Leu Ser Leu Asp Glu Phe Lys Asn Arg Tyr Leu 
-30 -25 -20 

20 Met Ser Ala Glu Ala Phe Glu Gin Leu Lys Thr Gin Phe Asp Leu Asn 
-15 -10 -5 

Ala Glu Thr Ser Ala Cys Arg He Asn Ser Val Asn Val Pro Ser Glu 
-11 5 10 

Leu Asp Leu Arg Ser Leu Arg Thr Val Thr Pro lie Arg Met Gin Gly 
15 20 25 30 

Gly Cys Gly Ser Cys Trp Ala Phe Ser Gly Val Ala Ala Thr Glu Ser 
30 35 40 45 

Ala Tyr Leu Ala Tyr Arg Asn Thr Ser Leu Asp Leu Ser Glu Gin Glu 
50 55 60 
Leu Val Asp Cys Ala Ser Gln His Gly Cys His Gly Asp Thr Ile Pro
65 70 75 

Arg Gly lie Glu Tyr lie Gin Gin Asn Gly Val Val Glu Glu Arg Ser 
5 80 85 90 

Tyr Pro Tyr Val Ala Arg Glu Gin Arg Cys Arg Arg Pro Asn Ser Gin 
95 100 105 110 

10 His" Tyr Gly He Ser Asn Tyr Cys Gin He Tyr Pro Pro Asp Val Lys 

115 120 125 

Gin He Arg Glu Ala Leu Thr Gin Thr His Thr Ala He Ala Val He 
130 135 140 

15 

lie Gly He Lys Asp Leu Arg Ala Phe Gin His Tyr Asp Gly Arg Thr 
145 150 155 

He He Gin His Asp Asn Gly Tyr Gin Pro Asn Tyr His Ala Val Asn 
20 160 165 170 

He Val Gly Tyr Gly Ser Thr Gin Gly Asp Asp Tyr Trp He Val Arg 
175 180 185 190 

25 Asn Ser Trp Asp Thr Thr Trp Gly Asp Ser Gly Tyr Gly Tyr Phe Gin 

195 200 205 

Ala Gly Asn Asn Leu Met Met He Glu Gin Tyr Pro Tyr Val Val He 
210 215 220 

30 

Met 
(2) INFORMATION FOR SEQ ID NO:7:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 491 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: CDNA 



10 



(ix) FEATURE: 

(A) NAME /KEY: CDS 

(B) LOCATION: 1. .390 



15 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GAT CAA GTC GAT GTT AAA GAT TGT GCC AAC AAT GAA ATC AAA AAA GTA 48 
20 Asp Gin Val Asp Val Lys Asp Cys Ala Asn Asn Glu lie Lys Lys Val 
1 5 10 15 



ATG GTC GAT GGT TGC CAT GGT TCT GAT CCA TGC ATA ATC CAT CGT GGT 96 
Met Val Asp Gly Cys His Gly Ser Asp Pro Cys He He His Arg Gly 
25 20 - 25 30 

AAA CCA TTC ACT TTG GAA GCC TTA TTC GAT GCC AAC CAA AAC ACT AAA 144 
Lys Pro Phe Thr Leu Glu Ala Leu Phe Asp Ala Asn Gin Asn Thr Lys 
35 40 45 

30 

ACC GCT AAA ACT GAA ATC AAA GCC AGC CTC GAT GGT CTT GAA ATT GAT 192 
Thr Ala Lys Thr Glu He Lys Ala Ser Leu Asp Gly Leu Glu He Asp 

50 55 / 60 
GTT CCC GGT ATT GAT ACC AAT GCT TGC CAT TTT ATG AAA TGT CCA TTG 240

Val Pro Gly lie Asp Thr Asn Ala Cys His Phe Met Lys Cys Pro Leu 

65 70 75 80 

5 GTT AAA GGT CAA CAA TAT GAT GCC AAA TAT ACA TGG AAT GTG CCC AAA 288 
Val Lys Gly Gin Gin Tyr Asp Ala Lys Tyr Thr Trp Asn Val Pro Lys 
85 90 95 

ATT GCA CCA AAA TCT GAA AAC GTT GTC GTT ACA GTC AAA CTT GTT GGT 336 
10 lie Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Leu Val Gly 
100 105 110 

GAT AAT GGT GTT TTG GCT TGC GCT ATT GCT ACC CAC GCT AAA ATC CGT 3 84 

Asp Asn Gly Val Leu Ala Cys Ala He Ala Thr His Ala Lys He Arg 
15 115 120 125 

GAT TAAAAAAAAA AAATAAATAT GAAAATTTTC ACCAACATCG AACAAAATTC 437 
Asp 

130 



20 



25 



30 



AATAACCAAA ATTTGAATCA AAAACGGAAT TCCAAGCTGA GCGCCGGTCG CTAC 4 91 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 129 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Asp Gln Val Asp Val Lys Asp Cys Ala Asn Asn Glu Ile Lys Lys Val
1 5 10 15

Met Val Asp Gly Cys His Gly Ser Asp Pro Cys Ile Ile His Arg Gly
20 25 30

Lys Pro Phe Thr Leu Glu Ala Leu Phe Asp Ala Asn Gln Asn Thr Lys
35 40 45

Thr Ala Lys Thr Glu Ile Lys Ala Ser Leu Asp Gly Leu Glu Ile Asp
50 55 60

Val Pro Gly Ile Asp Thr Asn Ala Cys His Phe Met Lys Cys Pro Leu
65 70 75 80

Val Lys Gly Gln Gln Tyr Asp Ala Lys Tyr Thr Trp Asn Val Pro Lys
85 90 95

Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Leu Val Gly
100 105 110

Asp Asn Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Ile Arg
115 120 125

Asp

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1172 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..738

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAATTCCTTT TTTTTTCTTT CTCTCTCTAA AATCTAAAAT CCATCCAAC ATG AAA ATT 58
Met Lys Ile
-98

GTT TTG GCC ATC GCC TCA TTG TTG GCA TTG AGC GCT GTT TAT GCT CGT 106
Thr Leu Ala Ile Ala Ser Leu Leu Ala Leu Ser Ala Val Tyr Ala Arg
-95 -90 -85 -80

CCA TCA TCG ATC AAA ACT TTT GAA GAA TAC AAA AAA GCC TTC AAC AAA 154
Pro Ser Ser Ile Lys Thr Phe Glu Glu Tyr Lys Lys Ala Phe Asn Lys
-75 -70 -65

AGT TAT GCT ACC TTC GAA GAT CAA GAA GCT GCC CGT AAA AAC TTT TTG 202
Ser Tyr Ala Thr Phe Glu Asp Glu Glu Ala Ala Arg Lys Asn Phe Leu
-60 -55 -50

GAA TCA GTA AAA TAT GTT CAA TCA AAT GGA GGT GCC ATC AAC CAT TTG 250
Glu Ser Val Lys Tyr Val Gln Ser Asn Gly Gly Ala Ile Asn His Leu
-45 -40 -35

TCC GAT TTG TCG TTG GAT GAA TTC AAA AAC CGA TTT TTG ATG AGT GCA 298
Ser Asp Leu Ser Leu Asp Glu Phe Lys Asn Arg Phe Leu Met Ser Ala
-30 -25 -20

GAA GCT TTT GAA CAC CTC AAA ACT CAA TTC GAT TTG AAT GCT GAA ACT 346
Glu Ala Phe Glu His Leu Lys Thr Gln Phe Asp Leu Asn Ala Glu Thr
-15 -10 -5 -1 1

AAC GCC TGC AGT ATC AAT GGA AAT GCT CCA GCT GAA ATC GAT TTG CGA 394
Asn Ala Cys Ser Ile Asn Gly Asn Ala Pro Ala Glu Ile Asp Leu Arg
5 10 15

CAA ATG CGA ACT GTC ACT CCC ATT CGT ATG CAA GGA GGC TGT GGT TCA 442
Gln Met Arg Thr Val Thr Pro Ile Arg Met Gln Gly Gly Cys Gly Ser
20 25 30

TGT TGG GCT TTC TCT GGT GTT GCC GCA ACT GAA TCA GCT TAT TTG GCT 490
Cys Trp Ala Phe Ser Gly Val Ala Ala Thr Glu Ser Ala Tyr Leu Ala
35 40 45

CAC CGT AAT CAA TCA TTG GAT CTT GCT GAA CAA GAA TTA GTC GAT TGT 538
His Arg Asn Gln Ser Leu Asp Leu Ala Glu Gln Glu Leu Val Asp Cys
50 55 60 65

GCT TCC CAA CAC GGT TGT CAT GGT GAT ACC ATT CCA CGT GGT ATT GAA 586
Ala Ser Gln His Gly Cys His Gly Asp Thr Ile Pro Arg Gly Ile Glu
70 75 80

TAC ATC CAA CAT AAT GGT GTC GTC CAA GAA AGC TAC TAT CGA TAC GTT 634
Tyr Ile Gln His Asn Gly Val Val Gln Glu Ser Tyr Tyr Arg Tyr Val
85 90 95

GCA CGA GAA CAA TCA TGC CGA CGA CCA AAT GCA CAA CGT TTC GGT ATC 682
Ala Arg Glu Gln Ser Cys Arg Arg Pro Asn Ala Gln Arg Phe Gly Ile
100 105 110

TCA AAC TAT TGC CAA ATT TAC CCA CCA AAT GCA AAC AAA ATT CGT GAA 730
Ser Asn Tyr Cys Gln Ile Tyr Pro Pro Asn Ala Asn Lys Ile Arg Glu
115 120 125

GCT TTG GCT CAA ACC CAC AGC GCT ATT GCC GTC ATT ATT GGC ATC AAA 778
Ala Leu Ala Gln Thr His Ser Ala Ile Ala Val Ile Ile Gly Ile Lys
130 135 140 145

GAT TTA GAC GCA TTC CGT CAT TAT GAT GGC CGA ACA ATC ATT CAA CGC 826
Asp Leu Asp Ala Phe Arg His Tyr Asp Gly Arg Thr Ile Ile Gln Arg
150 155 160

GAT AAT GGT TAC CAA CCA AAC TAT CAC GCT GTC AAC ATT GTT GGT TAC 874
Asp Asn Gly Tyr Gln Pro Asn Tyr His Ala Val Asn Ile Val Gly Tyr
165 170 175

AGT AAC GCA CAA GGT GTC GAT TAT TGG ATC GTA CGA AAC AGT TGG GAT 922
Ser Asn Ala Gln Gly Val Asp Tyr Trp Ile Val Arg Asn Ser Trp Asp
180 185 190

ACC AAT TGG GGT GAT AAT GGT TAC GGT TAT TTT GCT GCC AAC ATC GAT 970
Thr Asn Trp Gly Asp Asn Gly Tyr Gly Tyr Phe Ala Ala Asn Ile Asp
195 200 205

TTG ATG ATG ATT GAA GAA TAT CCA TAT GTT GTC ATT CTC TAAACAAAAA 1019
Leu Met Met Ile Glu Glu Tyr Pro Tyr Val Val Ile Leu
210 215 220

GACAATTTCT TATATGATTG TCACTAATTT ATTTAAAATC AAAATTTTTA GAAAATGAAT 1079
AAATTCATTC ACAAAAATTA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1139
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 1172

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 320 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Lys Ile Thr Leu Ala Ile Ala Ser Leu Leu Ala Leu Ser Ala Val
-98 -95 -90 -85

Tyr Ala Arg Pro Ser Ser Ile Lys Thr Phe Glu Glu Tyr Lys Lys Ala
-80 -75 -70

Phe Asn Lys Ser Tyr Ala Thr Phe Glu Asp Glu Glu Ala Ala Arg Lys
-65 -60 -55

Asn Phe Leu Glu Ser Val Lys Tyr Val Gln Ser Asn Gly Gly Ala Ile
-50 -45 -40 -35

Asn His Leu Ser Asp Leu Ser Leu Asp Glu Phe Lys Asn Arg Phe Leu
-30 -25 -20

Met Ser Ala Glu Ala Phe Glu His Leu Lys Thr Gln Phe Asp Leu Asn
-15 -10 -5

Ala Glu Thr Asn Ala Cys Ser Ile Asn Gly Asn Ala Pro Ala Glu Ile
-1 1 5 10

Asp Leu Arg Gln Met Arg Thr Val Thr Pro Ile Arg Met Gln Gly Gly
15 20 25 30

Cys Gly Ser Cys Trp Ala Phe Ser Gly Val Ala Ala Thr Glu Ser Ala
35 40 45

Tyr Leu Ala His Arg Asn Gln Ser Leu Asp Leu Ala Glu Gln Glu Leu
50 55 60

Val Asp Cys Ala Ser Gln His Gly Cys His Gly Asp Thr Ile Pro Arg
65 70 75

Gly Ile Glu Tyr Ile Gln His Asn Gly Val Val Gln Glu Ser Tyr Tyr
80 85 90

Arg Tyr Val Ala Arg Glu Gln Ser Cys Arg Arg Pro Asn Ala Gln Arg
95 100 105 110

Phe Gly Ile Ser Asn Tyr Cys Gln Ile Tyr Pro Pro Asn Ala Asn Lys
115 120 125

Ile Arg Glu Ala Leu Ala Gln Thr His Ser Ala Ile Ala Val Ile Ile
130 135 140

Gly Ile Lys Asp Leu Asp Ala Phe Arg His Tyr Asp Gly Arg Thr Ile
145 150 155

Ile Gln Arg Asp Asn Gly Tyr Gln Pro Asn Tyr His Ala Val Asn Ile
160 165 170

Val Gly Tyr Ser Asn Ala Gln Gly Val Asp Tyr Trp Ile Val Arg Asn
175 180 185 190

Ser Trp Asp Thr Asn Trp Gly Asp Asn Gly Tyr Gly Tyr Phe Ala Ala
195 200 205

Asn Ile Asp Leu Met Met Ile Glu Glu Tyr Pro Tyr Val Val Ile Leu
210 215 220



(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 222 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 50
(D) OTHER INFORMATION: /label=Xaa is His or Tyr

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 81
(D) OTHER INFORMATION: /label=Xaa is Glu or Lys

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 124
(D) OTHER INFORMATION: /label=Xaa is Ala or Val

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 136
(D) OTHER INFORMATION: /label=Xaa is Ser or Thr

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 215
(D) OTHER INFORMATION: /label=Xaa is Glu or Gln

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Thr Asn Ala Cys Ser Ile Asn Gly Asn Ala Pro Ala Glu Ile Asp Leu
1 5 10 15

Arg Gln Met Arg Thr Val Thr Pro Ile Arg Met Gln Gly Gly Cys Gly
20 25 30

Ser Cys Trp Ala Phe Ser Gly Val Ala Ala Thr Glu Ser Ala Tyr Leu
35 40 45

Ala Xaa Arg Asn Gln Ser Leu Asp Leu Ala Glu Gln Glu Leu Val Asp
50 55 60

Cys Ala Ser Gln His Gly Cys His Gly Asp Thr Ile Pro Arg Gly Ile
65 70 75 80

Xaa Tyr Ile Gln His Asn Gly Val Val Gln Glu Ser Tyr Tyr Arg Tyr
85 90 95

Val Ala Arg Glu Gln Ser Cys Arg Arg Pro Asn Ala Gln Arg Phe Gly
100 105 110

Ile Ser Asn Tyr Cys Gln Ile Tyr Pro Pro Asn Xaa Asn Lys Ile Arg
115 120 125

Glu Ala Leu Ala Gln Thr His Xaa Ala Ile Ala Val Ile Ile Gly Ile
130 135 140

Lys Asp Leu Asp Ala Phe Arg His Tyr Asp Gly Arg Thr Ile Ile Gln
145 150 155 160

Arg Asp Asn Gly Tyr Gln Pro Asn Tyr His Ala Val Asn Ile Val Gly
165 170 175

Tyr Ser Asn Ala Gln Gly Val Asp Tyr Trp Ile Val Arg Asn Ser Trp
180 185 190

Asp Thr Asn Trp Gly Asp Asn Gly Tyr Gly Tyr Phe Ala Ala Asn Ile
195 200 205

Asp Leu Met Met Ile Glu Xaa Tyr Pro Tyr Val Val Ile Leu
210 215 220



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 129 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 47
(D) OTHER INFORMATION: /label=Xaa is Thr or Ser

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 114
(D) OTHER INFORMATION: /label=Xaa is Asp or Asn

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 127
(D) OTHER INFORMATION: /label=Xaa is Ile or Leu

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



Asp Gln Val Asp Val Lys Asp Cys Ala Asn His Glu Ile Lys Lys Val
1 5 10 15

Leu Val Pro Gly Cys His Gly Ser Glu Pro Cys Ile Ile His Arg Gly
20 25 30

Lys Pro Phe Gln Leu Glu Ala Val Phe Glu Ala Asn Gln Asn Xaa Lys
35 40 45

Thr Ala Lys Ile Glu Ile Lys Ala Ser Ile Asp Gly Leu Glu Val Asp
50 55 60

Val Pro Gly Ile Asp Pro Asn Ala Cys His Tyr Met Lys Cys Pro Leu
65 70 75 80

Val Lys Gly Gln Gln Tyr Asp Ile Lys Tyr Thr Trp Asn Val Pro Lys
85 90 95

Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Val Met Gly
100 105 110

Asp Xaa Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Xaa Arg
115 120 125

Asp
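
The FEATURE entries for SEQ ID NO: 12 can be cross-checked against the sequence description by locating each Xaa residue. The short Python sketch below is illustrative only and not part of the original disclosure. The names `SEQ_ID_12`, `FEATURES` and `xaa_positions` are hypothetical, and the residue string is an assumed transcription of the listing above.

```python
# Residues of SEQ ID NO: 12 as three-letter codes (assumed transcription).
SEQ_ID_12 = """
Asp Gln Val Asp Val Lys Asp Cys Ala Asn His Glu Ile Lys Lys Val
Leu Val Pro Gly Cys His Gly Ser Glu Pro Cys Ile Ile His Arg Gly
Lys Pro Phe Gln Leu Glu Ala Val Phe Glu Ala Asn Gln Asn Xaa Lys
Thr Ala Lys Ile Glu Ile Lys Ala Ser Ile Asp Gly Leu Glu Val Asp
Val Pro Gly Ile Asp Pro Asn Ala Cys His Tyr Met Lys Cys Pro Leu
Val Lys Gly Gln Gln Tyr Asp Ile Lys Tyr Thr Trp Asn Val Pro Lys
Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Val Met Gly
Asp Xaa Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Xaa Arg
Asp
"""

# Polymorphic positions and alternatives as stated in the FEATURE entries.
FEATURES = {47: ("Thr", "Ser"), 114: ("Asp", "Asn"), 127: ("Ile", "Leu")}

residues = SEQ_ID_12.split()
# Positions are 1-based, matching the numbering beneath the sequence.
xaa_positions = [i for i, r in enumerate(residues, start=1) if r == "Xaa"]

print(len(residues), xaa_positions, xaa_positions == sorted(FEATURES))
```

The same check applies to SEQ ID NO: 11 and SEQ ID NO: 13 by substituting their residues and FEATURE locations.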



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 129 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 11
(D) OTHER INFORMATION: /label=Xaa is Asn or Ser

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 52
(D) OTHER INFORMATION: /label=Xaa is Thr or Ile

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 54
(D) OTHER INFORMATION: /label=Xaa is Ile or Thr

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 76
(D) OTHER INFORMATION: /label=Xaa is Met or Val

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 88
(D) OTHER INFORMATION: /label=Xaa is Ala or Ile

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 111
(D) OTHER INFORMATION: /label=Xaa is Val or Ile

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:



Asp Gln Val Asp Val Lys Asp Cys Ala Asn Xaa Glu Ile Lys Lys Val
1 5 10 15

Met Val Asp Gly Cys His Gly Ser Asp Pro Cys Ile Ile His Arg Gly
20 25 30

Lys Pro Phe Thr Leu Glu Ala Leu Phe Asp Ala Asn Gln Asn Thr Lys
35 40 45

Thr Ala Lys Xaa Glu Xaa Lys Ala Ser Leu Asp Gly Leu Glu Ile Asp
50 55 60

Val Pro Gly Ile Asp Thr Asn Ala Cys His Phe Xaa Lys Cys Pro Leu
65 70 75 80

Val Lys Gly Gln Gln Tyr Asp Xaa Lys Tyr Thr Trp Asn Val Pro Lys
85 90 95

Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Leu Xaa Gly
100 105 110

Asp Asn Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Ile Arg
115 120 125

Asp
CLAIMS

1. A protein allergen of Der p II comprising the amino acid sequence:

Asp Gln Val Asp Val Lys Asp Cys Ala Asn His Glu Ile Lys Lys Val Leu Val Pro
Gly Cys His Gly Ser Glu Pro Cys Ile Ile His Arg Gly Lys Pro Phe Gln Leu Glu
Ala Val Phe Glu Ala Asn Gln Asn Xaa1 Lys Thr Ala Lys Ile Glu Ile Lys Ala Ser
Ile Asp Gly Leu Glu Val Asp Val Pro Gly Ile Asp Pro Asn Ala Cys His Tyr Met
Lys Cys Pro Leu Val Lys Gly Gln Gln Tyr Asp Ile Lys Tyr Thr Trp Asn Val Pro
Lys Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Val Met Gly Xaa2 Asp
Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Xaa3 Arg Asp

where Xaa1 is selected from the group consisting of Thr and Ser;
where Xaa2 is selected from the group consisting of Asp and Asn; and
where Xaa3 is selected from the group consisting of Ile and Leu,
except for the amino acid sequence where Xaa1 is Thr, Xaa2 is Asp
and Xaa3 is Ile.

2. A protein allergen of Der f II comprising the amino acid sequence:

Asp Gln Val Asp Val Lys Asp Cys Ala Asn Xaa1 Glu Ile Lys Lys Val Met Val
Asp Gly Cys His Gly Ser Asp Pro Cys Ile Ile His Arg Gly Lys Pro Phe Thr Leu
Glu Ala Leu Phe Asp Ala Asn Gln Asn Thr Lys Thr Ala Lys Xaa2 Glu Xaa3 Lys
Ala Ser Leu Asp Gly Leu Glu Ile Asp Val Pro Gly Ile Asp Thr Asn Ala Cys His
Phe Xaa4 Lys Cys Pro Leu Val Lys Gly Gln Gln Tyr Asp Xaa5 Lys Tyr Thr Trp
Asn Val Pro Lys Ile Ala Pro Lys Ser Glu Asn Val Val Val Thr Val Lys Leu Xaa6
Gly Asp Asn Gly Val Leu Ala Cys Ala Ile Ala Thr His Ala Lys Ile Arg Asp

where Xaa1 is selected from the group consisting of Asn and Ser;
where Xaa2 is selected from the group consisting of Thr and Ile;
where Xaa3 is selected from the group consisting of Ile and Thr;
where Xaa4 is selected from the group consisting of Met and Val;
where Xaa5 is selected from the group consisting of Ala and Ile; and
where Xaa6 is selected from the group consisting of Val and Ile, with
the proviso that,
when Xaa1 is Asn, then Xaa3 is Thr; and
when Xaa3 is Ile, then Xaa1 is Ser.

3. A therapeutic composition comprising a protein allergen of claim 1
and a pharmaceutically acceptable carrier or diluent.

4. A method of treatment for sensitivity in an individual to house dust
mites, comprising administering to the individual an effective therapeutic amount
of a composition of claim 3.

5. A therapeutic composition comprising a protein allergen of claim 2
and a pharmaceutically acceptable carrier or diluent.

6. A method of treatment for sensitivity in an individual to house dust
mites, comprising administering to the individual an effective therapeutic amount
of a composition of claim 5.
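
For illustration only, the residue combinations left open by the Xaa definitions of claims 1 and 2 can be enumerated as follows. This Python sketch is not part of the claims and has no bearing on their legal scope. It assumes a literal reading of the exclusion in claim 1 and of the proviso in claim 2, and the names `CLAIM_1`, `CLAIM_2` and `claim_2_proviso` are hypothetical.

```python
from itertools import product

# Claim 1 (Der p II): alternatives recited for Xaa1, Xaa2 and Xaa3.
CLAIM_1 = [("Thr", "Ser"), ("Asp", "Asn"), ("Ile", "Leu")]
# Combination expressly excepted in claim 1.
CLAIM_1_EXCLUDED = ("Thr", "Asp", "Ile")

# Claim 2 (Der f II): alternatives recited for Xaa1 through Xaa6.
CLAIM_2 = [("Asn", "Ser"), ("Thr", "Ile"), ("Ile", "Thr"),
           ("Met", "Val"), ("Ala", "Ile"), ("Val", "Ile")]


def claim_2_proviso(combo):
    """Literal reading of the proviso linking Xaa1 and Xaa3 in claim 2."""
    xaa1, xaa3 = combo[0], combo[2]
    if xaa1 == "Asn" and xaa3 != "Thr":
        return False
    if xaa3 == "Ile" and xaa1 != "Ser":
        return False
    return True


claim_1_variants = [c for c in product(*CLAIM_1) if c != CLAIM_1_EXCLUDED]
claim_2_variants = [c for c in product(*CLAIM_2) if claim_2_proviso(c)]

print(len(claim_1_variants), len(claim_2_variants))  # prints "7 48"
```

On this reading the two limbs of the proviso in claim 2 exclude the same pairing (Xaa1 is Asn together with Xaa3 is Ile), so 48 of the 64 possible combinations remain.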